An online scraper of text, headers and other content from websites and their pages | TextThief

13.05.2025 / 21.05.2025
Web tool, Django app, Terminal user interface, Scraper

Rules
  • Separate URLs with spaces.
  • When crawling via a list of URLs, at most 100 URLs can be used.
  • After a successful parse, the results are stored only until 00:00 of the following day.
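
The first two rules can be sketched in Python as a small input check. This is only an illustration, not the tool's actual code: the function name `parse_url_list`, the choice to reject (rather than truncate) an over-long list, and the requirement that each entry be an absolute http(s) URL are all assumptions.

```python
from urllib.parse import urlparse

MAX_URLS = 100  # limit stated in the rules above


def parse_url_list(raw: str, limit: int = MAX_URLS) -> list[str]:
    """Split a space-separated string of URLs and enforce the limit.

    Hypothetical helper: rejecting the whole input when it is too long
    is an assumption, the real tool may behave differently.
    """
    urls = raw.split()  # splits on any run of whitespace, including spaces
    if len(urls) > limit:
        raise ValueError(f"got {len(urls)} URLs, the limit is {limit}")
    for url in urls:
        parts = urlparse(url)
        # Assumption: only absolute http(s) URLs are accepted.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return urls
```

For example, `parse_url_list("https://example.com/a https://example.com/b")` returns a list of two URLs, while a string containing 101 URLs raises `ValueError`.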

About the text scraper

An online tool to scrape text, headers and source code (just use a CSS selector) from websites, web pages and lists of pages. It then applies basic processing: counting the words, counting the unique words, and building a list of how often each word occurs in the text.
The tool works in three modes: parsing a single page, a list of pages, or an entire site.
This web page text parser is also a web implementation of the text-thief Python library, which provides general functionality for working with text. There is also a command-line implementation, which is much easier to understand and study. The library is available via PyPI, or you can install it directly from its sources.
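
The basic processing described above (word count, unique word count, and word frequency list) can be sketched with the Python standard library alone. This is a minimal illustration and not the text-thief API: the function name `text_stats`, the lowercasing, and the definition of a word as a run of `\w` characters are assumptions.

```python
import re
from collections import Counter


def text_stats(text: str) -> dict:
    """Return the word count, unique word count and word frequencies.

    Hypothetical helper: a "word" here is any run of letters, digits
    or underscores, compared case-insensitively.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    return {
        "total_words": len(words),
        "unique_words": len(counts),
        # Most frequent first; ties keep the order of first appearance.
        "frequencies": counts.most_common(),
    }


stats = text_stats("The cat saw the other cat")
print(stats["total_words"])   # 6
print(stats["unique_words"])  # 4
print(stats["frequencies"])   # [('the', 2), ('cat', 2), ('saw', 1), ('other', 1)]
```

In a scraper, the `text` argument would be the text extracted from the elements matched by the CSS selector.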

Similar tools

27.11.2023 / 21.05.2025
Web tool, Django app, Terminal user interface, Scraper
An online web tool to scrape all images from a whole website, a list of pages, or a single page. It can also be used as a Python script on your own computer or as a Django app and, of course, as a regular online tool that is always accessible.

04.05.2025 / 21.05.2025
Web tool, Django app, Telegram bot, With graphical interface, Terminal user interface, Scraper
This tool is a web version of, and skin for, my library for parsing links from websites. The library has several more skins: a CLI script, a GUI application, a Telegram bot, and a regular Python library (link-thief) available through PyPI.
