Remember late 2022, when ChatGPT arrived on the international scene and you communicated with AI through a simple chatbot interface? It was remarkable that you could type a relatively short prompt and it would instantly type back to you: a machine with communication capability!
For most of us, this remains the most common daily mode of accessing and using AI. Many of us use AI only as a replacement for Google Search. In fact, Google Search AI Overviews, announced last year, are now a standard feature for a significant portion of users and search queries. They appear at the top of the results and offer a follow-up for a deeper dive; only after that do you reach the old list of responses. As of mid-June 2025, the rollout of AI Overviews has progressed to the point where these overviews are a common sight at the top of search results pages. Yet a whole world of communication modes is now open in most of the frontier AI models, and with those new modes comes a whole world of possibilities.
In order to more fully utilize the remarkable range of capabilities of AI today, we need to become comfortable with the many input and output modes that are available. From audio, voice, image and stunning video to massive formally formatted documents, spreadsheets, computer code, databases and more, the potential to input and output material is beyond what most of us take for granted. That is not to mention the emerging potential of embodied AI, which includes all of these capabilities in a humanoid form, as discussed in this column two weeks ago.
So, what can AI do with images and videos? Of course, you can import images as still photographs and instruct AI to edit the photos, adding or deleting objects within the image. Many apps do this exceptionally well. This does raise questions about deepfakes, images that can be shared as if they were real, when actually they are altered by AI in an attempt to mislead the public. Most such images do carry a watermark that indicates the image was generated or altered by AI. However, there are watermark removers that will wash away those well-intended alerts.
One example of using the image capability of AI is in the app PictureThis, which describes itself as a “botanist in your pocket.” As one would expect, you can upload a picture from your smartphone and it will identify the plant. It will also provide a diagnosis of any conditions or diseases that it can determine through the image, offer care suggestions such as optimal lighting and watering, point out toxicity to humans and pets, and provide tips on how to help your plant thrive. In education, we can utilize AI to provide these kinds of services to learners who simply take a snapshot of their work.
We can build upon the PictureThis example to create a kind of “professor in your pocket” that offers enhanced responses to images that might include, for example, an attempt to solve a mathematical problem, develop a chemistry formula or create an outline for an essay. The student may simply take a smartphone photo or a screenshot of their work and share it with the app, which will respond with what may be right and wrong in the work and suggest further research and context that will be helpful.
Many of us are in positions where we need to construct spreadsheets, PowerPoint presentations and more formal reports with cover pages, tables of contents, citations and references. AI stands ready to convert data, text and free-form writing into perfectly formatted final products. Use the upload icon that is commonly located near the prompt window in ChatGPT, Gemini, Claude or other leading models to upload your material for analysis or formatting. Gemini, a Google product, has direct connections with Google apps.
Many of these features are available on the free tier of the products. Most major AI companies have a subscription tier for around $20 per month that provides limited access to higher levels of their products. In addition, there are business, enterprise, cloud and API levels that serve organizations and developers. As a senior fellow conducting research, I maintain a couple of subscriptions that enable me to move seamlessly through my work process: from ideation to content creation, then to enhancing the research with creative concepts and, finally, to developing a formal final report.
Using the pro versions gives access to deep research tools in most cases. This mode provides far more “thinking” by the AI tool, which can provide more extensive web-based research, generate novel ideas and pursue alternative approaches with extensive documentation, analysis and graphical output in the form of tables, spreadsheets and charts. Using a combination of these approaches, one can assemble a thoughtful deep dive into a current or emerging topic.
AI can also provide effective “brainstorming” that integrates deep insights into the topics being explored. One currently free tool is Stanford University’s Storm, a research prototype that supports interactive research and creative analyses. Storm assists with article creation and development and offers an intriguing roundtable conversation that enables several virtual and human participants to join in the brainstorming from distant locations.
This has tremendous potential for sparking interactive debates and discussions among learners that can include AI-generated participants. I encourage faculty to consider using this tool as a developmental activity for learners to probe deeply into topics in your discipline as well as to provide experience in collaborative virtual discussions that presage experiences they may encounter when they enter or advance in the workforce.
In general, we are underutilizing not only the analytical and composition capabilities of AI, but also the wealth of multimode capabilities of these tools. Depending upon your needs, you have both input and output capabilities in audio, video, images, spreadsheets, coding, graphics and multimedia combinations. The key to developing skill with these tools is to incorporate their time-saving and illustrative capabilities into your daily work.
So, if you are writing a paper and have some data to include, try out an AI app to generate a spreadsheet and choose the best chart to further clarify and emphasize trends. If you need a modest app to perform a repetitive function for yourself or for others, such as generating mean, mode and standard deviation, you can describe the inputs and outputs to AI and prompt it to create the code for you. Perhaps you want to create a short video clip as a simulation of how a new process might work; AI can do that from a description of the scene that you provide. If you want to create a logo for a prospective project, initiative or other activity, AI will give you a variety of custom-created logos. In all cases, you can ask for revisions and alterations. Think of AI as your dedicated assistant who has multimedia skills and is eager to help you with these tasks. If you are not sure how to get started, of course, just ask AI.
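To make the "modest app" idea concrete, here is a minimal sketch of the kind of code an AI assistant might return if you asked for mean, mode and standard deviation. It is an illustration under stated assumptions, not the output of any particular tool: the function name `summarize` and the sample scores are hypothetical, and it relies only on Python's built-in `statistics` module.

```python
import statistics


def summarize(values):
    """Return the mean, mode(s) and sample standard deviation of a list of numbers.

    `summarize` is a hypothetical helper name chosen for this illustration.
    """
    if len(values) < 2:
        # The sample standard deviation is undefined for fewer than two values.
        raise ValueError("need at least two values to compute a standard deviation")
    return {
        "mean": statistics.mean(values),
        # multimode returns every most-common value, so ties are not hidden.
        "modes": statistics.multimode(values),
        # stdev is the sample (n - 1) standard deviation.
        "stdev": statistics.stdev(values),
    }


if __name__ == "__main__":
    scores = [70, 85, 85, 90, 100]  # made-up sample data
    print(summarize(scores))
```

If your numbers represent an entire population rather than a sample, you might ask the AI to switch to `statistics.pstdev`; that kind of follow-up revision is exactly what these tools handle well.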