Design a Basic Search Engine (Google or Bing) | System Design Interview Prep
Description
Visit Our Website: https://interviewpen.com/?utm_campaign=google
Join Our Discord (24/7 help): https://discord.gg/Qy85PT9wj6
Join Our Newsletter - The Blueprint: https://theblueprint.dev/subscribe
Like & Subscribe: https://youtube.com/@interviewpen
This is an example of a full video available on interviewpen.com. Check out our website to find more premium content like this!
Problem Statement:
Provide a design overview of a basic search engine. Your search engine system must support the following:
- **Retrieval:** The search engine should display a list of relevant web pages in response to a user query. The results should include the page title, URL, and a brief summary.
- **Indexing:** The system should be able to crawl and index web pages from the Internet. The indexing process should store metadata about the web pages, such as their URL, title, and a brief summary.
- **Scalability:** The system should be designed to handle a large number of queries and indexed web pages, ensuring that response times remain low as the search engine scales.
Finer concerns such as query processing & page ranking can be briefly addressed, but are not mandatory.
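To make the retrieval and indexing requirements concrete, here is a minimal, single-process sketch in Python. It is not the design from the video: the names `Page`, `TinySearchIndex`, `add_page`, and `search` are hypothetical, everything lives in memory, and results are returned in URL order with no ranking. It only illustrates how stored metadata (URL, title, summary) and an inverted index from terms to pages could fit together.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    """Metadata kept per indexed page (hypothetical schema)."""
    url: str
    title: str
    summary: str


class TinySearchIndex:
    """Toy in-memory inverted index: term -> set of page URLs."""

    def __init__(self):
        self.pages = {}                    # url -> Page metadata
        self.postings = defaultdict(set)   # term -> URLs containing that term

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_page(self, page, body):
        # Store the metadata, then record every term from the title and body.
        self.pages[page.url] = page
        for term in self.tokenize(f"{page.title} {body}"):
            self.postings[term].add(page.url)

    def search(self, query, limit=10):
        terms = self.tokenize(query)
        if not terms:
            return []
        # AND semantics: a page must contain every query term.
        urls = set.intersection(*(self.postings.get(t, set()) for t in terms))
        # No ranking here; sort by URL only to keep the output deterministic.
        return [self.pages[u] for u in sorted(urls)[:limit]]


index = TinySearchIndex()
index.add_page(Page("https://example.com/a", "Cats", "About cats"),
               "Cats sleep a lot")
index.add_page(Page("https://example.com/b", "Dogs", "About dogs"),
               "Dogs and cats can be friends")

for hit in index.search("cats friends"):
    print(hit.title, hit.url, hit.summary)
# prints: Dogs https://example.com/b About dogs
```

A production system would replace the in-memory dictionaries with sharded storage and add a ranking step, which is where the scalability requirement comes in.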
Table of Contents:
0:00 - Requirements
0:20 - How Search Works
1:57 - API: Accepting Search Queries
2:16 - Database: Storing Site Metadata
4:19 - Database Demands
4:51 - Page BLOB Store
5:17 - Database Sharding
6:10 - Global Index
6:33 - Text Index
7:09 - The System Thus Far
7:52 - Crawling
9:06 - robots.txt Cache
9:24 - Crawler Demands
10:31 - The System So Far
11:04 - URL Frontier: Priority
11:39 - URL Frontier: Politeness
12:01 - Naive URL Frontier
12:31 - Multiple Queues
13:35 - Solving for Politeness
15:51 - URL Frontier: Recap
16:16 - URL Frontier Demands
17:24 - Full Design Review
17:49 - Extensions
19:10 - Visit interviewpen.com
Socials:
Twitter: https://twitter.com/InterviewPen
Twitter (The Blueprint): https://twitter.com/theblueprintdev
Translated At: 2025-04-12T07:28:50Z