Introduction to AI
In this article, I'll show how AI agents can automate keyword research given nothing but an article idea. Along the way, I'll explain how these agents work, how they make decisions, and what types of agents exist.
I've always been skeptical of new technologies, distrusting hype and trends. Especially AI. It seemed cool, but for me, it wasn't very useful.
Sure, it was fun to just chat for a while, but it got boring, and I won't even mention the errors, inaccuracies, and overconfidence in what they generate. They're only as good as the initial training data. And what data do they train on? That's right, websites and their content.
The thing is, intent is determined by the search algorithms of the search engine being asked (Google, Bing, Yandex, etc.). More precisely, users determine the intent of a particular search query by selecting and spending more time on resources that best match the intent specified in the query.
But if you ask, "Describe in detail the course and results of the 1804 'Glass War' between Switzerland and Mongolia," the AI will try to play along and can very confidently answer what actually happened in 1804.
I understand that this example is very exaggerated, and even the most modern models won't fall for it, but in any case, you should always check what they've generated for you. To be fair, the same applies to anything written and published online, because it's the internet: anyone can write anything, however they want.
It's precisely this ability to quickly process data that makes AI an ideal candidate for automation. But to turn a simple chatty bot into a reliable assistant, we need to give it autonomy and the right to act. This is where AI Agents come in.
What is an AI Agent and what does it look like?
According to an internet definition that is definitely not AI-generated, an AI Agent is an application that tries to achieve a goal by observing its environment and acting on it using the tools at its disposal.
In other words, to consider a regular Python script an AI Agent, we need:
- A language model—what will actually generate the corresponding output.
- Tools—what enables the language model to interact with the outside world.
- A workflow control system (The Orchestration Layer)—that is, a system that controls the "Think, Act, Observe" cycle—a state machine, if you will, that determines which model to use and what tools to provide to that model.
In the next chapter, I'll describe creating an AI Agent using the LangChain library. To create an agent, you'll need to specify the tools, the model, and some context. In this case, the library itself plays the role of the "Workflow Control System."
How agents work and think
Fundamentally, all agents operate in a special cycle called "Think, Act, Observe." It consists of five key steps:
- Get a Mission — the agent receives an initial goal to act upon. This mission can be defined by either the user or the developer during creation.
- Context Analysis — at this stage, the agent analyzes the entire context provided, including the developer's initial prompts, descriptions of the provided tools, and what's in its temporary memory.
- Think — the model reasons about the user's request, develops a plan for further action, and decides which tools can be used.
- Act — the agent, having selected the necessary tools, begins to use them in accordance with the requirements for completing the given mission.
- Observe and Repeat — each tool must have an output; this output is added to the context, and the agent returns to step 3 to analyze the new, updated context.
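The five steps above can be sketched as a minimal, library-free loop. Here, `think` stands in for the language model and `tools` is a plain dict of functions; both are illustrative stand-ins, not a real agent API:

```python
# A minimal, library-free sketch of the Think-Act-Observe cycle.
def run_agent(mission, tools, think, max_steps=5):
    context = [f"Mission: {mission}"]               # 1. get a mission
    for _ in range(max_steps):
        decision = think(context)                   # 2-3. analyze context, plan
        if decision["action"] == "finish":
            return decision["answer"]
        tool = tools[decision["action"]]            # 4. act with the chosen tool
        observation = tool(**decision["args"])
        context.append(f"Observed: {observation}")  # 5. observe and repeat
    return None

# A scripted "model" that first calls a tool, then finishes.
steps = iter([
    {"action": "lookup", "args": {"q": "weather in Warsaw"}},
    {"action": "finish", "answer": "It is sunny in Warsaw."},
])
result = run_agent("check the weather", {"lookup": lambda q: "sunny"},
                   lambda context: next(steps))
print(result)  # → It is sunny in Warsaw.
```

Real frameworks add tool schemas, error handling, and memory on top, but the control flow stays exactly this shape.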

AI Agent Lifecycle. Taken from: https://www.kaggle.com/whitepaper-introduction-to-agents
What kinds of agents are there?
In the previous chapter, I showed how an agent works in its most basic, simple form. But did you know that agents form a hierarchy and can become even more complex and capable? As a result, they can solve much harder problems.

Hierarchy of Agent Systems. Taken from https://www.kaggle.com/whitepaper-introduction-to-agents
Level 0. The core reasoning system. A large language model on its own.
- Example: You asked Gemini, "Write an essay about the impact of the Industrial Revolution on the environment."
- Concept: The model uses only its internal knowledge, acquired during training. It doesn't access the internet or run code. If it doesn't have this knowledge or it's outdated, it will start hallucinating.
Level 1. The connected problem-solver. At this level, tools are attached to the language model, allowing it to explore and interact with the environment.
- Example: You asked, "What's the weather like in Warsaw right now?"
- Concept: The model understands that it doesn't know the answer, but it has a tool (function) called get_weather(city). It calls this tool, receives data, and generates a response in natural language. This is where the connection with the outside world begins.
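As a rough sketch of that interaction (the `get_weather` function and its data are made up for illustration; a real deployment would call a weather API):

```python
# A hypothetical Level 1 tool with stubbed data.
def get_weather(city: str) -> str:
    """Return the current weather for `city` (stubbed for illustration)."""
    fake_data = {"Warsaw": "3°C, overcast"}
    return fake_data.get(city, "unknown")

# The model emits a tool call; the runtime executes it and feeds the
# observation back into the context for the final natural-language answer.
tool_call = {"name": "get_weather", "args": {"city": "Warsaw"}}
observation = get_weather(**tool_call["args"])
print(observation)  # → 3°C, overcast
```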
Level 2. The strategic problem-solver. At this level, the agent acquires a context, and this context can be used to limit and frame the agent or customize its behavior.
- Example: The agent we create in the article (SEO specialist).
- Concept: You don't just provide a tool; you define a role and context: "You are an SEO specialist, your goal is to expand the semantic core." The agent remembers its mission and can perform several actions in a row (find competitors -> extract keywords -> filter) within a single session, maintaining the task context.
Level 3. The Collaborative Multi-Agent System. At this level, different agents with their own tools and contexts unite to solve a common problem.
- Example: "Blog Editing."
- Concept: There are three agents: the Researcher (searches for topics online), the Copywriter (writes text based on what they find), and the Editor (checks the text and corrects style). You assign the task "Write an article about AI," and agents pass the work down the chain, like coworkers in an office, until the result is ready.
Level 4. The self-evolving system. This is a system that is able to understand its own shortcomings and limitations and create additional tools and/or agents to address them in order to complete the task.
- Example: A travel booking agent faced with a new task.
- Concept: You ask the agent to book a hotel on a specific website whose API the agent doesn't know. A Level 4 agent won't crash. It will read the website's documentation, write a new tool (code) to interact with the API, test it, and complete your task.
A couple of notes on some of these levels. In practice, you can collapse Level 3 into Level 2: instead of creating dozens of separate agents with their own tools, you can create all the necessary tools and hand them to a single agent. This is much simpler to implement, since you don't have to worry about wiring up multiple agents and defining how they interact.
By this logic, a self-evolving system doesn't necessarily have to consist of multiple agents either. It can simply create or find the necessary tools to perform its tasks.
However, there are a couple of limitations that get in the way of the approach I described above.
- First, context. Each agent has a limited context window, so it's impossible to cram every tool into a single agent. That's why, sooner or later, the agents have to be multiplied.
- Second, agents lack long-term memory. With each new interaction, the context has to be rebuilt and the agent re-instructed for the task at hand.
Later in this article, I'll demonstrate the creation of a Level 2 AI Agent, as they are currently the easiest to develop and suitable for most routine tasks. At least mine are. And this is certainly not because I don't yet know how to create Level 3 and 4 agents.
And as you may have noticed, starting from Level 1, the key difference between an agent and a simple chatbot is the ability to interact with its environment. And they do this with the help of tools. Let's figure out what types there are.
About tools for AI agents
Tools are functions that the Large Language Model can use to perform assigned tasks. All tools can be roughly divided by the type of action:
- Those that get something are used to obtain information, such as querying a database.
- Those that do something are used to perform actions on existing systems, such as creating a database entry.
These tools can also be divided by their origin:
- External tools are those that are not built into the Language Model and must be specified when creating an agent.
- Built-in tools are those that the Language Model can use without explicitly specifying them in the toolchain when creating an agent. For example, Gemini has the following built-in tools: Google Search, Code Execution, URL-Context, and Computer Use. More built-in tools for Gemini can be found here.
- Agent as a tool - any agent can itself be used as a tool by another agent.
When developing tools for your agent, you should also consider the best practices:
- Document everything thoroughly—use clear names, describe all input parameters, explain what your function returns, provide default values, and add examples. All of this ends up in the agent's context, and the agent relies on it to understand what the tool is for.
- Describe the result, not the execution flow—that is, explain what needs to be done, not how to do it, and don't try to explain the sequence of actions.
- Tools should be simple—don't overcomplicate things or write functions with multiple purposes.
- Build tools so that they don't return large volumes of data, but only the essentials—for example, instead of returning a table with a large number of rows, simply store the table name for future reference.
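As an illustration of these practices, here's a hypothetical tool (the function name, parameters, and its in-memory catalog are invented for the example):

```python
# A tool that follows the practices above: a clear name, documented
# parameters, a described result, and a compact return value.
def get_top_queries(topic: str, limit: int = 10) -> list[str]:
    """Return up to `limit` popular search queries related to `topic`.

    Args:
        topic: A short phrase describing the subject, e.g. "ai agents".
        limit: Maximum number of queries to return (default 10).

    Returns:
        A list of query strings, most popular first.
    """
    # Stubbed data source; a real tool would query an API or database.
    catalog = {"ai agents": ["what is an ai agent", "ai agent examples"]}
    return catalog.get(topic.lower(), [])[:limit]
```

Note that the docstring describes the result ("return up to `limit` popular queries"), not the steps the agent should take to use it.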
And this is just the basics. It all seems very complex and confusing, but in reality, you can build your first agent (Level 2) in about 15 minutes. Let's write an agent that automates keyword research for SEO.
Let's make the simplest, most functional AI Agent
What you will need and what you should consider before starting
It took me a lot of effort to come up with something that would be both easy to implement and genuinely useful. Here's what we'll be writing: an AI Agent that identifies keywords for search engine optimization from a description of an article idea, using the top 20 Google search results.
It's not complicated; we'll need the following components:
- One Google model - gemini-2.5-flash.
- A search results parsing tool - Tavily.
- And the language we'll be writing it in - Python/LangChain.
Why did I choose this particular stack for my agent? It's pretty straightforward: both services have free tiers you can practice on, those limits are more than enough for personal use, and getting the API keys is a matter of minutes.
Obtaining an API key to use Gemini
Visit Google AI Studio, where you can create and manage API keys for Gemini: https://aistudio.google.com/app/api-keys. You'll need a Google account, of course. On the dashboard:

Click Create API Key, then enter a name for the key and select "Default Gemini Project." The API key is ready; copy or save it; we'll need it later.

Obtain an API key to use Tavily
Now let's get an API key for Tavily. We'll need it to get search results not only from Google but also from other search engines.
To do this, first, you need to register and go to the home page. Then, create and copy your API key.

Tavily API Key Creation Page
AI agent code
Now we have everything we need to create our first AI Agent. Let's create a separate directory, a virtual environment, install the necessary libraries, and create several files (the agent configuration file and the agent file itself).
Create a separate directory, a virtual environment, and activate it:
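For example, on Linux or macOS (the directory name is arbitrary):

```shell
mkdir seo-agent && cd seo-agent
python3 -m venv venv
source venv/bin/activate
```

On Windows, the activation command is `venv\Scripts\activate` instead.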
Let's install the necessary packages:
Create two files in the same directory: settings.json and main.py. In the first one, place your API keys like this:
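A minimal settings.json could look like this; the key names here are my assumption and just need to match whatever names the agent code reads:

```json
{
  "GEMINI_API_KEY": "your-gemini-api-key",
  "TAVILY_API_KEY": "your-tavily-api-key"
}
```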
The second file will contain the actual logic of our agent. Here's its code; more details are below.
The external tool in this case is the get_search_results function. The workflow control system is the langchain package itself, which receives the model (gemini-2.5-flash), get_search_results as the only tool, and the communication context.
Next, we invoke the agent and provide it with even more context for working with the input data. Note that I haven't written anything about how or what tools it should use. But I have specified the exact result I expect.
Then I structure it and save it to a file.
AI Agent Code, Enhanced
Our agent works, but plain text output isn't always convenient for programmatic processing. Furthermore, we're using a third-party search engine when the model itself has powerful built-in capabilities. Let's refactor it and make the agent more professional and autonomous.
This can be done in the following way:
- First, use Gemini's built-in search tool.
- Second, use structured output.
Let's look at the improved version, and below I'll explain step-by-step what we changed and why.
As you can see, the code has become a bit cleaner and more understandable. We've also removed our tool and now use Gemini's built-in Google search (I discussed built-in tools earlier in the chapter above).
Also, when calling the agent, we no longer specify how to format and return our data. Now, we specify the format of the returned data in a specially defined class — KeywordsInfo.
You can find more information about how to specify structured results for agents here. Roughly speaking, when invoking an agent, we specify what we want to get from it, and in the KeywordsInfo class, we refine and structure it accordingly.
We create an agent with the appropriate specification of the results we want to get:
Afterwards, we launch the agent:
We receive the data and do what we need with it. In my case, I simply save it to a JSON file. Perhaps I'll then send it to the server or pass it on to the next agent for further tasks.
This was an example of a very simple AI agent: first with a search tool we implemented ourselves, then with the built-in one, and both with and without a structured response.
Conclusions, or why generative AI is good for automating "some" processes
That's what AI Agents are. They're not particularly difficult to write, and they're only as cool as the tools and models that drive them.
I realize that in this article, I didn't cover important topics like testing, error handling, working with files (or artifacts), and deploying autonomous agents to servers. But that wasn't necessary; in this article, I wanted to show that creating your own AI Agent is even easier than writing a Telegram bot. And that it can be very useful when paired with certain tools.
This article is just the first, introductory part of working with AI Agents. More and more interesting things will follow. To stay up to date with future articles, subscribe to the corresponding RSS feed or email newsletter, and don't forget to leave a comment if I missed anything or made a mistake. Bye.