An introduction to, and continuation of, the semantic core series.
I will show what clustering is, why we do it and, most importantly, how it is done manually and how it can be automated with third-party services. But before answering the question of why, we need to answer the question of what. Namely: what is the clustering of search queries?
You can cluster in two ways:
- The first method relies on the lexical similarity of the keywords being grouped.
- The second method relies on the user's intent and groups queries by the similarity of their search results. It splits into two subcategories:
  - Soft clustering. It is enough for the pages in the search results to match only partially.
  - Hard clustering. The pages in the search results must match completely.
A third type of clustering is sometimes also distinguished: logical clustering, which emphasizes combining queries by meaning. For example, grouping all queries into a "Purchases", "Services", or "Products" cluster, and so on. But as I will show below, logical and lexical clustering are barely distinguishable from each other. More on that later.
Lexical clustering of keywords
There is not much to say about the first method. Suppose we have the following table of keywords:
| Keyword | Impressions |
|---|---|
| Telegram Bot Menu | 500 |
| Webhook Telegram Bot | 500 |
| YouTube MP3 Telegram | 500 |
| Telegram SetWebhook | 500 |
| Telegram Bot Menu Button | 500 |
| YouTube Converter MP3 Telegram | 500 |
To cluster these keywords with the first method, we group them by the presence of common words. The table above, for example, would be grouped like this:
| Keyword | Cluster | Impressions |
|---|---|---|
| Telegram Bot Menu | Telegram Bot Menu | 500 |
| Telegram Bot Menu Button | Telegram Bot Menu | 500 |
| Webhook Telegram Bot | Webhook Telegram Bot | 500 |
| Telegram SetWebhook | Webhook Telegram Bot | 500 |
| YouTube MP3 Telegram | YouTube MP3 Telegram | 500 |
| YouTube Converter MP3 Telegram | YouTube MP3 Telegram | 500 |
This is a fairly simple way of clustering that requires only basic knowledge of the language, nothing more.
Logical clustering works by combining keywords according to their logical similarity. If there are many queries about developing Telegram bots, it is probably worth combining them into one cluster and calling it something like "Telegram bot development services".
In practice, it is best to combine these two methods: the first is easily automated, while the second requires your intervention. In effect, they are just one method.
Clustering by top pages in SERP
But simple as it is, it is not the method I would recommend. Why? It cannot capture the user's intent in its clusters. Or rather, it cannot reflect the intent as assumed by the search engine, i.e. Google or Yandex.
Search engines do not literally know a query's intent, but thanks to their algorithms they surface exactly those pages that best answer the hidden intention of the user who entered the query.
So to "grab" the user's intent, we need the second method: clustering by the top pages in the SERP.
How does it work? Suppose our semantic core contains the following queries (we will reuse the queries mentioned earlier):
- Webhook Telegram Bot
- Telegram SetWebhook
- YouTube MP3 Telegram
Now we enter these queries into the target search engine to find out which pages rank for them (below is a table with the results of analyzing Google's search results).
| Top URLs for "Webhook Telegram Bot" | Top URLs for "Telegram SetWebhook" | Top URLs for "YouTube MP3 Telegram" |
|---|---|---|
| https://stackoverflow.com/questions/42554548/how-to-set-telegram-bot-webhook | https://stackoverflow.com/questions/36905455/how-to-use-setwebhook-in-telegram | https://t.me/ytbaudiobot |
| https://habr.com/en/companies/digitalleague/articles/716760/ | https://telegram-bot-sdk.readme.io/reference/setwebhook | https://telegram.me/convert_youtube_to_mp3_bot |
| https://core.telegram.org/bots/webhooks | https://core.telegram.org/bots/webhooks | https://www.telegrambots.info/bots/ytbaudiobot |
| https://timeweb.cloud/tutorials/nodejs/otlichie-polling-i-webhook-v-telegram-botah | https://decovar.dev/blog/2018/12/02/telegram-bot-webhook-en/ | https://ichip.ru/podborki/programmy-prilozheniya/5-telegram-botov-kotorye-pomogut-skachat-video-s-youtube-i-socsetej-834852 |
| https://core.telegram.org/bots/api | https://core.telegram.org/bots/api | https://lifehacker.ru/download-audio-youtube/ |
The second query, although similar to the first, shows slightly different results. This is easily explained: the first query is rather broad, while the second is much more specific, since it is about a particular method of setting webhooks. The two result lists share only two addresses.
The third query is completely different: not a single matching address.
This table makes it easy to explain the difference between hard and soft clustering. With soft clustering and a similarity threshold of 2 (i.e. how many addresses must match), the first and second queries would fall into one cluster.
With hard clustering, none of the queries would end up in a common cluster, because hard clustering requires all addresses in the search results to match.
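The comparison can be sketched in a few lines, using the top-5 URL lists from the table above. The soft threshold of 2 matching addresses follows the example in the text.

```python
# Top-5 search results per query, copied from the table above.
serps = {
    "Webhook Telegram Bot": {
        "https://stackoverflow.com/questions/42554548/how-to-set-telegram-bot-webhook",
        "https://habr.com/en/companies/digitalleague/articles/716760/",
        "https://core.telegram.org/bots/webhooks",
        "https://timeweb.cloud/tutorials/nodejs/otlichie-polling-i-webhook-v-telegram-botah",
        "https://core.telegram.org/bots/api",
    },
    "Telegram SetWebhook": {
        "https://stackoverflow.com/questions/36905455/how-to-use-setwebhook-in-telegram",
        "https://telegram-bot-sdk.readme.io/reference/setwebhook",
        "https://core.telegram.org/bots/webhooks",
        "https://decovar.dev/blog/2018/12/02/telegram-bot-webhook-en/",
        "https://core.telegram.org/bots/api",
    },
    "YouTube MP3 Telegram": {
        "https://t.me/ytbaudiobot",
        "https://telegram.me/convert_youtube_to_mp3_bot",
        "https://www.telegrambots.info/bots/ytbaudiobot",
        "https://lifehacker.ru/download-audio-youtube/",
    },
}

def soft_match(a: set[str], b: set[str], min_common: int = 2) -> bool:
    """Soft clustering: enough that at least min_common URLs coincide."""
    return len(a & b) >= min_common

def hard_match(a: set[str], b: set[str]) -> bool:
    """Hard clustering: all URLs must coincide."""
    return a == b

# The first two queries share 2 URLs, so soft clustering merges them...
print(soft_match(serps["Webhook Telegram Bot"], serps["Telegram SetWebhook"]))   # True
# ...but hard clustering does not, because the lists are not identical.
print(hard_match(serps["Webhook Telegram Bot"], serps["Telegram SetWebhook"]))   # False
# The third query shares no URLs with the first, so neither method merges it.
print(soft_match(serps["Webhook Telegram Bot"], serps["YouTube MP3 Telegram"]))  # False
```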
Large core clustering
Now that we know what clustering is, what types exist, and why it is used, you may wonder: "But what if my semantic core has 1000 or more queries/keywords?"
You have three ways:
- Find suitable online tools and pay with money
- Build your own equivalents and pay with time
- Sort all the keywords manually and, again, pay with time
The third option may not seem like an option at all. But if your semantic core has up to 50 keywords, or if you already have a clustered core and are simply maintaining it, you can cluster it gradually this way.
If you decide to spend your time and build a semantic core clusterer, you will need to implement two interconnected tools:
- A search engine scraper. I have an article on my website about building such a scraper for Google, along with a ready-made Python script.
- The clusterer itself, which compares the resulting lists of addresses. Maybe someday I will build it myself.
And if you have a little money, you can use ready-made solutions. I can name only one that completely covers my clustering needs: the clusterizer from Arsenkin.

I recommend it; I use it myself. It works on a system of paid limits. The cheapest option is 28k limits for 30 days for $8, which is enough to cluster roughly 15k keywords with frequency data.
If you feed it the list of keywords from this table (obtained in the previous article), we get this table as output. How to use a clustered core is probably the next question on your mind, and I will try to answer it in the next chapter.
In which I will explain how to use it
Many articles describe the process and ways of building a clustered semantic core, but almost never how to use it. Everyone talks about its benefits without showing any benefit in practice. I want to fix that by walking through a previously clustered core.
The table has 9 columns; let's go through each of them:
- Search queries - the keyword itself
- Cluster - the cluster's general name, taken from the first search query in the group
- Queries in cluster - the number of keywords in the cluster
- Impressions - the number of impressions per month
- Cluster's impressions sum - the total monthly impressions of the entire cluster
- % Aggregators - the share of aggregator pages in the search results
- Main pages - the number of main (home) pages in the search results
- Toponym in the query - whether the query contains a toponym (place name)
- URLs - the search results for the query
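If you are building the table yourself, the derived columns are trivial to compute. A sketch for the "Cluster's impressions sum" column, over a few hypothetical rows (the row data here is illustrative, not from the article's core):

```python
from collections import defaultdict

# Hypothetical rows of a clustered core (only 3 of the 9 columns shown).
rows = [
    {"query": "webhook telegram bot", "cluster": "webhook telegram bot", "impressions": 500},
    {"query": "telegram setwebhook", "cluster": "webhook telegram bot", "impressions": 500},
    {"query": "youtube mp3 telegram", "cluster": "youtube mp3 telegram", "impressions": 500},
]

def cluster_impressions(rows: list[dict]) -> dict[str, int]:
    """Sum monthly impressions per cluster ("Cluster's impressions sum")."""
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["cluster"]] += row["impressions"]
    return dict(totals)

print(cluster_impressions(rows))
# {'webhook telegram bot': 1000, 'youtube mp3 telegram': 500}
```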
I usually add two more columns. One holds the addresses of the pages on my site that use a particular cluster, so I can track the progress of "implementing" the semantic core on the site. The other indicates the cluster type, so I understand what I should use the cluster for.
Once we have a clustered core, we need to look through all its clusters and decide what to use them for. I, for example, divide clusters by destination page type. The types are as follows:
| Abbreviation | Page type | Page description | Number of impressions* |
|---|---|---|---|
| A | Article | Informative material: a guide, tutorial, or case study. | 10-100 |
| BA | Big article | A very large article that combines several ordinary articles. Such pages are also called pillar pages. | 100-1000 |
| T | Tool | A simple terminal tool, or in our case a Telegram bot. | 10-100 |
| BT | Big tool | By a big tool I mean not only one that is complex to build, but also one that is online, i.e. available to everyone. | 100-1000 |
| PP** | Pagination page | A page that groups either articles or tools. | 1000-10000 |
* - the numbers here are approximate and depend on the core itself and on the competition between sites for rankings in search.
** - pagination pages are worth optimizing only if you can implement selective indexing for them; otherwise thousands upon thousands of such pages will eat up your site's crawl budget.
How do you decide which page type fits which cluster? Well, if you understand the topic and are a specialist in your site's area, a quick glance should be enough to tell which page type a cluster is best combined with. But what if not?
I can give only a few recommendations:
- Look at the user's intent. If you can see from the query that a person is looking for, say, free Telegram bots, the best page for such a cluster would be either a pillar page or a pagination page.
- If you see a huge cluster of 20-40 keywords, it may be worth using it for a pillar page and breaking it into several smaller articles.
- Do not try to stuff every keyword into your page. Choose only a few, and for some time after publication keep adjusting them to find the optimal phrases and combinations.
When you're done, you'll get something like this table, which you will later use as a compass for choosing the next article topic, and which will help you track what you have already done and what is still ahead. Just pick a cluster you like and start writing. But keep in mind that...
It is better to start with low-competition queries, where you can easily climb to the TOP 10 or TOP 5 of the search results. How do you determine that a cluster is low-competition? Look at columns F (% Aggregators) and G (Main pages).
The higher the percentage of aggregator pages, the easier it is to rank there. Why? Aggregators bring nothing new to the search; they simply copy other people's content, so they carry little or no weight in the search engine's eyes.
The fewer main pages in the results, the easier it is to rank there. Why? A main page usually means someone built a highly specialized site intent on taking all the impressions for that query. Such competitors are hard to outrank, if it is possible at all.
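These two signals can be folded into a rough sorting heuristic. The weighting below and the sample numbers are assumptions for illustration, not values from the article's core:

```python
def competition_score(aggregator_pct: float, main_pages: int) -> float:
    """Lower score = easier cluster: reward aggregators, penalize main pages.
    The 10x weight on main pages is an arbitrary assumption."""
    return main_pages * 10 - aggregator_pct

# Hypothetical clusters with their column F and G values.
clusters = [
    {"cluster": "webhook telegram bot", "aggregator_pct": 40.0, "main_pages": 1},
    {"cluster": "youtube mp3 telegram", "aggregator_pct": 10.0, "main_pages": 4},
]

# Sort easiest-first: more aggregators and fewer main pages float to the top.
clusters.sort(key=lambda c: competition_score(c["aggregator_pct"], c["main_pages"]))
print(clusters[0]["cluster"])  # the cluster with more aggregators, fewer main pages
```

This only orders clusters by the two columns discussed above; it is a starting point for prioritization, not a replacement for actually looking at the SERP.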
Conclusion
Of course, there are other online clusterers such as PixelPlus, Serpstat, and Topvisor, but I did not really like their output, and this article is not about the top clusterers but about the process as a whole.
The usefulness of clustering is obvious: if you need to understand which keywords to use in a particular article, structure the site, understand the direction of the site's development, or decide which keywords to promote in search advertising, then clustering is necessary.
And although clustering the core is not difficult, using it certainly will be. After all, there is an opinion that the "semantic core" is a kind of fiction invented by SEO companies and specialists to create the appearance of work and raise their prices.
An interesting point of view, but as we saw in the previous chapter on its use, the semantic core is far more useful than many imagine. Do not treat the semantic core as the end result of the work, but as a tool that helps achieve specific results (clicks, impressions, conversions).
How do you use the semantic core? Do you use it at all? I will be glad to read about it in the comments.