Meta tags

Meta Tag Content
description Learn web scraping with JavaScript and NodeJS with this step-by-step tutorial. We will see the different ways to scrape the web in JavaScript through lots of examples.
viewport width=device-width
Website Page URL https://www.scrapingbee.com/blog/web-scraping-javascript/

Heading tags

h1 tag

We found 1 h1 tag on this page URL; it is listed in the table below.

S.no h1 tag content
1 Web Scraping with JavaScript and NodeJS

h2 tag

We found 7 h2 tags on this page URL; they are listed in the table below.

S.no h2 tag content
1 Understanding NodeJS: A brief introduction
2 HTTP clients: querying the web
3 Data Extraction in JavaScript
4 Headless Browsers in JavaScript
5 Summary
6 Resources
7 Tired of getting blocked while scraping the web?

h3 tag

We found 15 h3 tags on this page URL; they are listed in the table below.

S.no h3 tag content
1 Prerequisites
2 Outcomes
3 The JavaScript Event Loop
4 1. Built-In HTTP Client
5 2. Fetch API
6 3. Axios
7 4. SuperAgent
8 5. Request
9 Regular expressions: the hard way
10 Cheerio: Core jQuery for traversing the DOM
11 jsdom: the DOM for Node
12 1. Puppeteer: the headless browser
13 2. Nightmare: an alternative to Puppeteer
14 3. Playwright, the new web scraping framework
15 You might also like:

h4 tag

We found 11 h4 tags on this page URL; they are listed in the table below.

S.no h4 tag content
1 SuperAgent plugins
2 Using the Cheerio NPM Package for Web Scraping
3 Infinite Scroll with Puppeteer
4 Block ressources with Puppeteer
5 Company
6 Tools
7 Legal
8 Product
9 How we compare
10 No code web scraping
11 Learning Web Scraping

h5 tag

Unfortunately, we were not able to find any h5 tags on this page URL.

h6 tag

Unfortunately, we were not able to find any h6 tags on this page URL.

HTML Formatting Elements - Important text (strong/bold) tags

S.no Tag content
1 Kevin Sahin | 02 August 2022 (updated) | 23 min read
2 Ryan Dahl introduced NodeJS in 2009
3 But enough of theory, let's check it out, shall we?
4 Not bad, two lines of code, no manual handling of data, no distinction between HTTP and HTTPS, and a native JSON object.
5 Should you use Request?
6 Proceed with caution please.
7 really
8 NodeJS
9 non-blocking
10 HTTP clients
11 Cheerio
12 JSDOM
13 Puppeteer
14 Nightmare
15 Check it out please.
16 Kevin Sahin
17 Ben Force
18 Maxine Meurer
19 Shadid Haque

HTML Formatting Elements - Italic text (i) tags

S.no Tag content

HTML Formatting Elements - Underline text (u) tags

S.no Tag content

HTML Formatting Elements - Code tags

S.no Tag content
1
document
2
window
3
const http = require('http');
const PORT = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World');
});

server.listen(PORT, () => {
  console.log(`Server running at PORT:${PORT}/`);
});
4
require
5
createServer
6
listen
7
accept
8
while (true);
9
MyServer.js
10
node MyServer.js
11
const http = require('http');

const req = http.request('http://example.com', res => {
	const data = [];

	res.on('data', _ => data.push(_))
	res.on('end', () => console.log(data.join()))
});

req.end();
12
fetch()
13
async function fetch_demo()
{
	const resp = await fetch('https://www.reddit.com/r/programming.json');

	console.log(await resp.json());
}

fetch_demo();
14
await
15
json()
16
fetch
17
POST
18
npm install axios
19
const axios = require('axios')

axios
	.get('https://www.reddit.com/r/programming.json')
	.then((response) => {
		console.log(response)
	})
	.catch((error) => {
		console.error(error)
	});
20
async function getForum() {
	try {
		const response = await axios.get(
			'https://www.reddit.com/r/programming.json'
		)
		console.log(response)
	} catch (error) {
		console.error(error)
	}
}
21
getForum
22
const superagent = require("superagent")
const forumURL = "https://www.reddit.com/r/programming.json"

// callbacks
superagent
	.get(forumURL)
	.end((error, response) => {
		console.log(response)
	})

// promises
superagent
	.get(forumURL)
	.then((response) => {
		console.log(response)
	})
	.catch((error) => {
		console.error(error)
	})

// promises with async/await
async function getForum() {
	try {
		const response = await superagent.get(forumURL)
		console.log(response)
	} catch (error) {
		console.error(error)
	}
}
23
npm install superagent
24
const request = require('request')
request('https://www.reddit.com/r/programming.json', function (
  error,
  response,
  body
) {
  console.error('error:', error)
  console.log('body:', body)
})
25
npm install request
26
const htmlString = '<label>Username: John Doe</label>'
const result = htmlString.match(/<label>Username: (.+)<\/label>/)

console.log(result[1])
// John Doe
27
String.match()
28
(.+)
29
result[1]
30
<label>
31
const cheerio = require('cheerio')
const $ = cheerio.load('<h2 class="title">Hello world</h2>')

$('h2.title').text('Hello there!')
$('h2').addClass('welcome')

$.html()
// <h2 class="title welcome">Hello there!</h2>
32
onClick
33
npm install cheerio axios
34
crawler.js
35
const axios = require('axios');
const cheerio = require('cheerio');

const getPostTitles = async () => {
	try {
		const { data } = await axios.get(
			'https://old.reddit.com/r/programming/'
		);
		const $ = cheerio.load(data);
		const postTitles = [];

		$('div > p.title > a').each((_idx, el) => {
			const postTitle = $(el).text()
			postTitles.push(postTitle)
		});

		return postTitles;
	} catch (error) {
		throw error;
	}
};

getPostTitles()
    .then((postTitles) => console.log(postTitles));
36
getPostTitles()
37
cheerio.load()
38
$
39
Inspect
40
div > p.title > a
41
$('div > p.title > a')
42
each()
43
text()
44
node crawler.js
45
const { JSDOM } = require('jsdom')
const { document } = new JSDOM(
	'<h2 class="title">Hello world</h2>'
).window

const heading = document.querySelector('.title')
heading.textContent = 'Hello there!'
heading.classList.add('welcome')

heading.innerHTML
// <h2 class="title welcome">Hello there!</h2>
46
querySelector()
47
<div>
48
const { JSDOM } = require("jsdom")

const HTML = `
	<html>
		<body>
			<button onclick="const e = document.createElement('div'); e.id = 'myid'; this.parentNode.appendChild(e);">Click me</button>
		</body>
	</html>`;

const dom = new JSDOM(HTML, {
	runScripts: "dangerously",
	resources: "usable"
});

const document = dom.window.document;

const button = document.querySelector('button');

console.log("Element before click: " + document.querySelector('div#myid'));
button.click();
console.log("Element after click: " + document.querySelector('div#myid'));
49
require()
50
HTML
51
runScripts
52
click()
53
Element before click: null
Element after click: [object HTMLDivElement]
54
resources
55
npm install puppeteer
56
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD
57
const puppeteer = require('puppeteer')

async function getVisual() {
	try {
		const URL = 'https://www.reddit.com/r/programming/'
		const browser = await puppeteer.launch()

		const page = await browser.newPage()
		await page.goto(URL)

		await page.screenshot({ path: 'screenshot.png' })
		await page.pdf({ path: 'page.pdf' })

		await browser.close()
	} catch (error) {
		console.error(error)
	}
}

getVisual()
58
getVisual()
59
puppeteer.launch()
60
newPage()
61
goto()
62
goto
63
screenshot()
64
pdf()
65
close()
66
npm install nightmare
67
const Nightmare = require('nightmare')
const nightmare = Nightmare()

nightmare
	.goto('https://search.brave.com/')
	.type('#searchbox', 'ScrapingBee')
	.click('#submit-button')
	.wait('#results a')
	.evaluate(
		() => document.querySelector('#results a').href
	)
	.end()
	.then((link) => {
		console.log('ScrapingBee Web Link:', link)
	})
	.catch((error) => {
		console.error('Search failed:', error)
	})
68
nightmare
69
type
70
#searchbox
71
click
72
#submit-button
73
wait
74
evaluate()
75
<a>
76
href
77
end()
78
ScrapingBee Web Link: https://www.scrapingbee.com/
79
const playwright = require('playwright');
async function main() {
    const browser = await playwright.chromium.launch({
        headless: false // setting this to true will not run the UI
    });

    const page = await browser.newPage();
    await page.goto('https://finance.yahoo.com/world-indices');
    await page.waitForTimeout(5000); // wait for 5 seconds
    await browser.close();
}

main();

The Anchor element (a) tags

S.no Anchor tag Content
1 Login
2 Sign Up
3 Pricing
4 FAQ
5 Blog
6 Other Features
7 Screenshots
8 Google search API
9 Data extraction
10 JavaScript scenario
11 No code web scraping
12 Developers
13 Tutorials
14 Documentation
15 Knowledge Base
16 Try ScrapingBee for Free
17 Event Loop
18 callback functions
19 http://localhost:3000
20 built-in HTTP client
21 separate library for HTTPS URLs
22 Fetch API
23 version 18
24 node-fetch
25 article on node-fetch
26 Promises
27 await
28 json() function
29 Response object
30 options argument
31 Github
32 GitHub
33 plugins
34 superagent-throttle
35 Request
36 wrapper libraries
37 String.match()
38 Cheerio
39 Single-page applications
40 Headless Browsers in JavaScript
41 r/programming
42 knowledge on XPath
43 NodeJS Axios proxy
44 jsdom
45 querySelector()
46 jsdom's documentation
47 here
48 SPAs
49 Puppeteer
50 Source
51 Puppeteer environment variables
52 How to download a file with Puppeteer
53 Handling and submitting HTML forms with Puppeteer
54 Using Puppeteer with Python and Pyppeteer
55 Nightmare
56 https://search.brave.com
57 type
58 click
59 wait
60 evaluate()
61 end()
62 https://www.scrapingbee.com
63 Playwright tutorial
64 guide on how not to get blocked as a crawler
65 scraping API platform
66 NodeJS Website
67 Puppeteer's Docs
68 Playright
69 ScrapingBee's Blog
70 Handling infinite scroll with Puppeteer
71 Node-unblocker
72 A Javascript developer's guide to cURL
73 ScrapingBee
74 Using the Cheerio NPM Package for Web Scraping Ben Force 9 min read In this article, you'll learn how to use Cheerio to scrape data from static HTML content.
75 Infinite Scroll with Puppeteer Maxine Meurer 10 min read Infinite page are everywhere. This article will teach you to scroll infinite pages with Puppeteer. We will also see the alternative methods for scraping infinite pages.
76 Block ressources with Puppeteer Shadid Haque 5 min read This article will show you how to intercept and block requests with Puppeteer using the request interception API and the puppeteer extra plugin.
77 Team
78 Company's journey
79 Rebranding
80 Affiliate Program
81 Curl converter
82 Terms of Service
83 Privacy Policy
84 GDPR Compliance
85 Data Processing Agreement
86 Features
87 Status
88 Alternative to Crawlera
89 Alternative to Luminati
90 Alternative to Smartproxy
91 Alternative to NetNut
92 Alternative to ScraperAPI
93 Alternatives to ScrapingBee
94 No code competitor monitoring
95 How to put scraped website data into Google Sheets
96 Send stock prices update to Slack
97 Scrape Amazon products' price with no code
98 Extract job listings, details and salaries
99 Web scraping questions
100 A guide to Web Scraping without getting blocked
101 Web Scraping Tools
102 Best Free Proxies
103 Best Mobile proxies
104 Web Scraping vs Web Crawling
105 Rotating and residential proxies
106 Web Scraping with Python
107 Web Scraping with PHP
108 Web Scraping with Java
109 Web Scraping with Ruby
110 Web Scraping with NodeJS
111 Web Scraping with R
112 Web Scraping with C#
113 Web Scraping with C++
114 Web Scraping with Elixir
115 Web Scraping with Perl
116 Web Scraping with Rust
117 Web Scraping with Go

Contact Us

If you have any inquiries or feedback, please don't hesitate to reach out to us at [email protected]. We will respond to your request as soon as possible. Thank you for your interest!
