
Generative AI in Academia: An Opportunity, A Challenge, and the Need for Open Models

Tags: GenAI, science, personal note


It is clear that the current AI wave, characterized by generative models, will have a lasting impact on how our world works. This is a quick write-up of some of the things I have personally thought about and discussed with others in recent months when it comes to its impact on the academic world. It started as a piece to collect and organize my own thoughts, but I also felt the need to spell it out so that others can start thinking about this important topic as well. The premise of this post is that, on the one hand, generative AI is a step toward dealing with the exponentially increasing number of papers, a day-to-day struggle of scientific work. On the other hand, it could also accelerate this development and lead to new iterations of hurdles that the scientific community has only just begun to properly overcome. And finally, there are ethical considerations that make a strong case for properly open generative AI models specialized to scientific needs, developed and maintained by scientists for scientists.

The Opportunity: AI for Discoverability

The scientific community is growing, but the number of publications per year is growing even faster. It is very much out of hand, at least for the average scientist and especially for young scientists (like me). The reasons behind this are multi-faceted, including metrics we impose upon ourselves, financial interests of publishers, and growing competition. A literature search on any topic of high interest (e.g., “large language models” or “sparse identification of non-linear dynamics”) often yields so many results that there is no way to read them all and do them justice. For a long time now, it has not been enough to read the latest issues of your favorite journals or even the latest literature review to stay at the frontier. Even worse, some quality publications may never be discovered at all, buried in a haystack of high-throughput micro-step publications that do not really consider what has been done before. It is clear that the increase in publications is making a rigorous consideration of prior art so difficult that it may well lead to stagnating progress in the medium term.

Even before my time in academia, it must already have been a problem that different people used vastly different language to refer to the same idea; in the end, each field has its own jargon. Hence, essentially since I started in academia, I have wished for a program that would allow me to ask: “Has somebody done X before?” Such a program would need to understand the relations between a myriad of concepts and have access to the largest possible library of publications. It could then allow a much more rigorous exploration of related work than searching by index terms alone, and in much less time. This would even go beyond already helpful “classical” bibliographic tools that rely on semantic analysis, such as the (pre-LLM version of) Semantic Scholar.
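To make the idea concrete, here is a minimal sketch of such a concept-level search in Python, assuming the sentence-transformers library; the three hard-coded abstracts are, of course, a toy stand-in for a real bibliographic index:

```python
# Minimal sketch: concept-level search over paper abstracts via embeddings.
# Assumes the sentence-transformers library; the tiny corpus below is a toy
# stand-in for a real bibliographic index with millions of entries.
import numpy as np
from sentence_transformers import SentenceTransformer

abstracts = {
    "Paper A": "We identify sparse governing equations of nonlinear dynamical systems from data.",
    "Paper B": "A survey of large language models for scientific literature search.",
    "Paper C": "We study the convergence of stochastic gradient descent in deep networks.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
corpus_vecs = model.encode(list(abstracts.values()), normalize_embeddings=True)

def has_somebody_done(question: str, top_k: int = 2):
    """Return the papers whose abstracts are semantically closest to the question."""
    q = model.encode([question], normalize_embeddings=True)
    scores = (corpus_vecs @ q.T).ravel()  # cosine similarity (vectors are unit-norm)
    ranked = np.argsort(-scores)[:top_k]
    titles = list(abstracts.keys())
    return [(titles[i], float(scores[i])) for i in ranked]

# Phrased without the field's jargon ("SINDy"), yet Paper A should rank first.
print(has_somebody_done("Has somebody learned model equations from measurements?"))
```

The point is that the match happens in concept space rather than on index terms: the query shares almost no keywords with Paper A’s abstract, yet the embeddings place them close together.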

“Has Somebody Done X Before?”

This is where I believe generative AI will be a huge opportunity for science. And it is finally here: besides the big commercial platforms like OpenAI’s ChatGPT and Google’s Gemini releasing “Deep Research” functionalities, I was delighted to see the release of Google Scholar Labs a couple of months ago. It chooses the sober “Has somebody done this before?”-style approach, without marketing itself as a replacement for literature reviews or a complete automation of the scientific writing process. Of course, Semantic Scholar and other tools have also followed through and now integrate generative AI. There are also developments like OpenScholar and Galactica in this direction. Extrapolating what might be achievable, such tools could allow a very fast and directed search, directly proposing the seminal papers and letting us discover “hidden gems”. Although bias could be a problem here, hallucination is not, as long as the model is only used to query a search engine and list the results. It might become an issue when relying on generated summaries, though. However, these are not really necessary, as the abstract is often the most polished part of a paper and short enough to read directly. Authors may even use generative AI as support to arrive at better abstracts based on their first attempts, but under their close supervision.
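To illustrate the retrieval-only pattern that sidesteps hallucination, here is a small sketch against the public Semantic Scholar Graph API; the query-condensing function is a deliberately trivial placeholder for where a language model could be slotted in:

```python
# Sketch of hallucination-free search: a model (here, a trivial placeholder)
# only produces search terms; every listed title, URL, and abstract comes
# verbatim from the search engine, never from generated text.
import requests

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def condense_to_query(question: str) -> str:
    # Placeholder for an LLM call that rewrites a natural-language question
    # into search terms; swapping in a real model changes nothing below.
    return question.removeprefix("Has somebody").removesuffix("before?").strip()

def list_papers(question: str, limit: int = 5) -> None:
    params = {
        "query": condense_to_query(question),
        "limit": limit,
        "fields": "title,url,abstract",
    }
    resp = requests.get(SEARCH_URL, params=params, timeout=30)
    resp.raise_for_status()
    for paper in resp.json().get("data", []):
        print(paper["title"])
        print("  ", paper.get("url"))
        print("  ", (paper.get("abstract") or "")[:200])  # verbatim, not generated

list_papers("Has somebody identified sparse nonlinear dynamics from data before?")
```

Nothing in the output is generated, so nothing can be hallucinated; the model’s only influence is on which (possibly biased) slice of the literature gets surfaced.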

A generative-AI-driven search of this kind is a big step toward overcoming the explosion in publications and averting a crisis in discoverability (alongside other measures, such as new metrics for ranking). However, at the same time, generative AI could also become that explosion’s primary driver.

Pressure on the Scientific System

For one or two years now, I have heard of conferences beginning to struggle under the flood of incoming papers—and the review system along with them¹. As a band-aid solution, generative AI models are increasingly employed to write “peer” reviews as well. It may be happening right now that a generative AI model is reviewing the very paper it spit out before, which can be expected to lead to superficial reviews and to reinforce bias, at least when unspecialized tools are used. Current developments only pour gasoline on this fire. For example, a couple of days ago, OpenAI released its Prism software, a LaTeX editor with an integrated chatbot that can be used to revise sections of the text, create diagrams from napkin sketches, and search for references. This is, in my eyes, an incredibly useful tool for scientific writing! Except, the danger is that it will most likely not only be used in this ideal capacity: people will just generate whole bodies of text, blindly accept the references it proposes (which seem to mostly stem from the arXiv preprint server), and generate fake diagrams (see, for example, figure 2 in this retracted paper that made the news early on).

The scientific system is like a boiler under constant pressure: pressure on scientists to stay relevant, to publish more, and to do so in less time. In our world, not only in science, scaling well is the primary measure of performance (evident, e.g., in the exponentially increasing number of papers). Not only do we need to keep up the pace, we must constantly increase it. Constant progress is often not enough. Many external factors push in this direction, and the newest one on the horizon is the advent of generative AI. It drastically increases the pressure, as everybody is now enabled to be faster than ever before. Generative AI means exponential scaling—but at the risk of losing rigor and quality. We need to ask ourselves: up to which point can our boiler stay in one piece? How much scaling is humanly possible without compromising what science boils down to? There is a strong argument to be made that scaling the scientific system by simply working faster, and probably less accurately, is not sustainable, as every new discovery relies on a solid existing foundation of knowledge. Alternatives include a larger workforce and more collaboration, but these of course scale more slowly. It thus seems very unlikely that the widespread adoption of generative AI across all parts of science will stall; it adds to the list of external pressures we have to accept and deal with.

The Challenge: Reducing Scientists to Validators?

Automation in science is not a bad thing. In fact, my own research relates to this topic as well. However, it needs to be done right. As a recent perspective article concludes: “[…] for LLMs to serve as relevant and effective creative engines and productivity enhancers, their deep integration into all steps of the scientific process should be pursued in collaboration and alignment with human scientific goals […]”. Taken to their extreme, the current trends mean humans cannot keep up with the pace, reducing their role to observers and validators who can sometimes intervene in a process that is mostly out of their control. Whether this is a bad thing depends on the perspective. After all, the world has seen similar transformations before, arguably leading to (overall) better conditions for everyone. However, those were repetitive tasks being automated, tasks that did not appeal to the breadth and creativity of the human mind. On our current path, it seems, generative AI is taking away not the remaining repetitive elements, but rather the creativity. Additionally, we are sold “reasoning” that is approximate in the best case and just plain wrong in the worst. So, we need to ensure it stays a tool, for now: an extension of our own creativity, to bring ideas to life that were previously out of scope for a single human. And as with any tool, we need to be in charge of it and critically examine its output.

From my personal experience, every output of current generative AI tools needs to be checked. One cannot blindly rely on current general-purpose tools to output valid references and citations, correct facts, code that does the right thing, information accurately extracted from PDF files, or sometimes even commas in the right places. To give some examples substantiating this claim: when I tried to extract data such as titles from PDF files in the past, in some cases the title would start correctly but then deviate into a plausible yet fake continuation. This makes such errors especially dangerous, as they are easily overlooked. Also, a generative-AI-based spell-checker I used to correct this very blog suggested placing commas where they made no sense; it must have misinterpreted the context somehow. To some extent, more advanced prompting can help, but in the end it is a stochastic process. And it is precisely the rare cases where it fails, after we have already lulled ourselves into a sense of security, that pose a risk.
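One cheap safeguard against exactly this failure mode is to verify that extracted strings literally occur in the source document. Here is a minimal sketch, assuming the pypdf library; extract_title_with_llm is a hypothetical placeholder for whatever model call produced the candidate title:

```python
# Safeguard against plausible-but-fake extractions: accept an LLM-extracted
# title only if it literally occurs in the PDF's own text. Assumes pypdf;
# extract_title_with_llm is a hypothetical stand-in for the actual model call.
import re
from pypdf import PdfReader

def extract_title_with_llm(pdf_path: str) -> str:
    # Hypothetical placeholder; returns a hard-coded candidate for the demo.
    return "A Plausible but Partially Invented Title"

def normalize(text: str) -> str:
    # Collapse whitespace and line breaks so layout differences don't cause misses.
    return re.sub(r"\s+", " ", text).strip().lower()

def verified_title(pdf_path: str, candidate: str) -> str | None:
    first_page = PdfReader(pdf_path).pages[0].extract_text() or ""
    if normalize(candidate) in normalize(first_page):
        return candidate
    return None  # plausible continuation not backed by the source: reject it

candidate = extract_title_with_llm("paper.pdf")
if verified_title("paper.pdf", candidate) is None:
    print("Extracted title not found verbatim in the PDF; check manually.")
```

The same pattern generalizes to references and quotations: whatever the model claims to have read should be mechanically checkable against the artifact it supposedly read it from.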

Completing the Cycle: Generative AI Models as Gatekeepers

So, scientists are increasingly entangled in dealing with generative AI, both when reviewing others’ work and—not always by choice—when using these tools themselves. Essentially, they thereby become training-data curators for generative AI models: models that are trained at their expense and at the same time sold back to them as an increasingly necessary service, required to stay on top and relevant. If everybody is using AI to write grant proposals faster, no one can afford not to. Just as the publishing industry could demand outrageous publication fees or keep research behind paywalls, this will move a significant amount of capital from the research community to a range of profit-driven AI companies chasing another billion-dollar market built on volunteers. It will bring along similar issues, such as the exclusion of institutions from poorer parts of the world. The problem here is not necessarily the use of AI models per se, not even their use as assistants for writing papers or reviews. It is that these companies charge a fee for a service that is based on our own work and would not be possible without it. They would effectively become the custodians of scientific work. To be fair, there is effort involved in training and hosting models, just as there is effort involved in typesetting, categorizing, and hosting a journal paper. The question is: will the price we pay reflect this effort, and whom do we choose to control our data?

This is a strong case for the scientific community to take control of the matter itself. We need properly open generative AI models, funded and maintained by institutions that are not taking part in a system whose ultimate and only goal is to increase profit. It happened before with the publishing industry, and it took science decades to even begin to recover. Chances are it will happen again if we are not careful. Open-weight and open-source models, like those from Mistral AI, free search engines like the aforementioned Semantic Scholar and Google Scholar Labs, and developments like Galactica are a first step. This trend needs to continue.

Concluding Remarks

Obviously, the above depiction is rather simplistic for the sake of conciseness. And this being a blog article, some emotion and subjectivity are involved. Many of the topics discussed are much broader in practice. However, all of this leads to an important point: I feel the scientific community, albeit large, has been, and still is, vulnerable. Vulnerable both to outside actors and external pressures, and to its own dynamics, which we let unfold freely with only limited counteraction. It is basically a complex adaptive system in which influencing the collective behavior is very difficult, as it starts with every individual’s actions. I argue we need to take further steps toward more control. Over the flood of papers. Over how we work (humanely). Over who controls access to what we produce.

How can this be achieved? Here are just some quick ideas. A first step is doing what scientists excel at: open discussion. Everybody can also, for themselves, critically reflect on where and when to use these tools and when not to. In any case, everyone should check their output line by line. Institutions with power need to take preventive action to overcome the challenges brought by generative AI, but also embrace its positive, supportive sides. In the end, it looks like it is here to stay.


  1. All em-dashes in this article were mindfully placed by a human agent.