Convert a webpage to an image or pdf using headless Chrome
The package can convert a webpage to an image or pdf. The conversion is done behind the scenes by Puppeteer which controls a headless version of Google Chrome.
Here's a quick example:
use Spatie\Browsershot\Browsershot;
// an image will be saved
Browsershot::url('https://example.com')->save($pathToImage);
It will save a pdf if the path passed to the save method has a pdf extension.
// a pdf will be saved
Browsershot::url('https://example.com')->save('example.pdf');
You can also use arbitrary html input; simply replace the url method with html:
Browsershot::html('<h1>Hello world!!</h1>')->save('example.pdf');
Browsershot can also get the body of an html page after JavaScript has been executed:
Browsershot::url('https://example.com')->bodyHtml(); // returns the html of the body
If you wish to retrieve an array of all the requests that the page triggered, you can do so:
$requests = Browsershot::url('https://example.com')
->triggeredRequests();
foreach ($requests as $request) {
$url = $request['url']; //https://example.com/
}
triggeredRequests() works well with waitUntilNetworkIdle, as described under "Waiting for lazy-loaded resources".
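A common follow-up is to see which hosts a page talks to, for example to decide what to block. The sketch below groups request URLs by host. It assumes only the shape shown above (each entry has a url key); groupRequestsByHost and the sample data are hypothetical and not part of Browsershot.

```php
<?php

/**
 * Group request URLs by their host name.
 *
 * @param array<int, array{url: string}> $requests
 * @return array<string, string[]>
 */
function groupRequestsByHost(array $requests): array
{
    $byHost = [];

    foreach ($requests as $request) {
        $host = parse_url($request['url'], PHP_URL_HOST);

        // Skip entries without a host, such as data: URLs.
        if (!is_string($host)) {
            continue;
        }

        $byHost[$host][] = $request['url'];
    }

    return $byHost;
}

// Hypothetical sample; in practice pass the result of triggeredRequests().
$requests = [
    ['url' => 'https://example.com/'],
    ['url' => 'https://example.com/app.js'],
    ['url' => 'https://cdn.example.net/font.woff2'],
];

$grouped = groupRequestsByHost($requests);
```

The keys of $grouped are the host names, which could then be reviewed before passing a selection to blockDomains().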
Support us
We invest a lot of resources into creating best in class open source packages. You can support us by buying one of our paid products.
We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.
Requirements
This package requires node 7.6.0 or higher and the Puppeteer Node library.
On macOS you can install Puppeteer in your project via NPM:
npm install puppeteer
Or you could opt to install it globally:
npm install puppeteer --global
On a Forge provisioned Ubuntu 16.04 server you can install the latest stable version of Chrome like this:
curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
sudo apt-get install -y nodejs gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgbm1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget libgbm-dev
sudo npm install --global --unsafe-perm puppeteer
sudo chmod -R o+rx /usr/lib/node_modules/puppeteer/.local-chromium
Custom node and npm binaries
Depending on your setup, node or npm might not be directly available to Browsershot. If you need to set these binary paths manually, you can do so by calling the setNodeBinary and setNpmBinary methods.
Browsershot::html('Foo')
->setNodeBinary('/usr/local/bin/node')
->setNpmBinary('/usr/local/bin/npm');
By default, Browsershot will use node and npm to execute commands.
Custom include path
If you don't want to manually specify binary paths, but rather modify the include path in general, you can set it using the setIncludePath method.
Browsershot::html('Foo')
->setIncludePath('$PATH:/usr/local/bin');
Setting the include path can be useful in cases where node and npm cannot be found automatically.
Custom node module path
If you want to use an alternative node_modules source you can set it using the setNodeModulePath method.
Browsershot::html('Foo')
->setNodeModulePath("/path/to/my/project/node_modules/");
Custom binary path
If you want to use an alternative script source you can set it using the setBinPath method.
Browsershot::html('Foo')
->setBinPath("/path/to/my/project/my_script.js");
Custom chrome/chromium executable path
If you want to use a different chrome or chromium executable from the one installed by puppeteer, you can set it using the setChromePath method.
Browsershot::html('Foo')
->setChromePath("/path/to/my/chrome");
Pass custom arguments to Chromium
If you need to pass custom arguments to Chromium, use the addChromiumArguments method.
The method accepts an array of key/value pairs, or simply values. All of these arguments will automatically be prefixed with --.
Browsershot::html('Foo')
->addChromiumArguments([
'some-argument-without-a-value',
'keyed-argument' => 'argument-value',
]);
If no key is provided, the value is passed through as a bare flag (still prefixed with --); if a key is provided, the flag is passed as --key=value.
Example array | Flags that will be passed to Chromium |
---|---|
['foo'] | --foo |
['foo', 'bar'] | --foo --bar |
['foo', 'bar' => 'baz'] | --foo --bar=baz |
This method can be useful in order to pass a flag to fix font rendering issues on some Linux distributions (e.g. CentOS).
Browsershot::html('Foo')
->addChromiumArguments([
'font-render-hinting' => 'none',
]);
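The key/value mapping described in this section can be sketched in plain PHP. This is only an illustration of the documented behaviour, not Browsershot's actual implementation; toChromiumFlags is a hypothetical helper.

```php
<?php

/**
 * Turn an argument array into Chromium command line flags.
 *
 * @param array<int|string, string> $arguments
 * @return string[]
 */
function toChromiumFlags(array $arguments): array
{
    $flags = [];

    foreach ($arguments as $key => $value) {
        // An integer key means the entry was given as a bare value.
        $flags[] = is_int($key)
            ? "--{$value}"
            : "--{$key}={$value}";
    }

    return $flags;
}

echo implode(' ', toChromiumFlags(['foo', 'bar' => 'baz'])), PHP_EOL; // --foo --bar=baz
```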
Installation
This package can be installed through Composer.
composer require spatie/browsershot
Usage
All examples assume that you imported this namespace at the top of your file:
use Spatie\Browsershot\Browsershot;
Screenshots
Here's the easiest way to create an image of a webpage:
Browsershot::url('https://example.com')->save($pathToImage);
Formatting the image
By default the screenshot's type will be png (according to Puppeteer's config), but you can change it to jpeg and pass a quality option.
Browsershot::url('https://example.com')
->setScreenshotType('jpeg', 100)
->save($pathToImage);
Sizing the image
By default the screenshot's size will match the resolution you use for your desktop. Want another size of screenshot? No problem!
Browsershot::url('https://example.com')
->windowSize(640, 480)
->save($pathToImage);
You can also set the size of the output image independently of the size of the window. Here's how to take a screenshot at a resolution of 1920x1080 and scale it down to something that fits inside 200x200.
// requires: use Spatie\Image\Manipulations;
Browsershot::url('https://example.com')
->windowSize(1920, 1080)
->fit(Manipulations::FIT_CONTAIN, 200, 200)
->save($pathToImage);
You can screenshot only a portion of the page by using clip.
Browsershot::url('https://example.com')
->clip($x, $y, $width, $height)
->save($pathToImage);
You can take a screenshot of an element matching a selector using select and an optional $selectorIndex, which is used to select the nth element (e.g. use $selectorIndex = 3 to get the fourth element, like div:eq(3)). By default $selectorIndex is 0, which represents the first matching element.
Browsershot::url('https://example.com')
->select('.some-selector', $selectorIndex)
->save($pathToImage);
Getting a screenshot as base64
If you need the base64 version of a screenshot you can use the base64Screenshot method. This can come in handy when you don't want to save the screenshot on disk.
$base64Data = Browsershot::url('https://example.com')
->base64Screenshot();
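One way to use the base64 string without touching the disk is to embed it in a data URI. The sketch below assumes the screenshot is a png (the default type); toDataUri is a hypothetical helper, and the sample value stands in for real screenshot data.

```php
<?php

/**
 * Wrap base64 encoded image data in a data URI.
 */
function toDataUri(string $base64Data, string $mimeType = 'image/png'): string
{
    return "data:{$mimeType};base64,{$base64Data}";
}

// Hypothetical stand-in for the string returned by base64Screenshot().
$base64Data = base64_encode('not a real image');

$imgTag = '<img src="' . toDataUri($base64Data) . '" alt="Screenshot">';
```

If you changed the type with setScreenshotType('jpeg', ...), pass 'image/jpeg' as the second argument instead.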
Manipulating the image
You can use all the methods spatie/image provides. Here's an example where we create a greyscale image:
Browsershot::url('https://example.com')
->windowSize(640, 480)
->greyscale()
->save($pathToImage);
Taking a full page screenshot
You can take a screenshot of the full length of the page by using fullPage().
Browsershot::url('https://example.com')
->fullPage()
->save($pathToImage);
Setting the device scale
You can also capture the webpage at higher pixel densities by passing a device scale factor value of 2 or 3. This mimics how the webpage would be displayed on a retina/xhdpi display.
Browsershot::url('https://example.com')
->deviceScaleFactor(2)
->save($pathToImage);
Mobile emulation
You can emulate a mobile view with the mobile and touch methods. mobile will set the display to take into account the page's meta viewport, as Chrome mobile would. touch will set the browser to emulate touch functionality, hence allowing spoofing for pages that check for touch. Along with the userAgent method, these can be used to effectively take a mobile screenshot of the page.
Browsershot::url('https://example.com')
->userAgent('My Mobile Browser 1.0')
->mobile()
->touch()
->save($pathToImage);
Device emulation
You can emulate a device view with the device method. The available device names are listed in Puppeteer's device descriptors.
$browsershot = new Browsershot('https://example.com', true);
$browsershot
->device('iPhone X')
->save($pathToImage);
is the same as
Browsershot::url('https://example.com')
->userAgent('Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1')
->windowSize(375, 812)
->deviceScaleFactor(3)
->mobile()
->touch()
->landscape(false)
->save($pathToImage);
Backgrounds
If you want to ignore the website's background when capturing a screenshot, use the hideBackground() method.
Browsershot::url('https://example.com')
->hideBackground()
->save($pathToImage);
Dismiss dialogs
JavaScript pop-ups such as alerts, prompts and confirmations cause rendering of the site to stop, which leads to an empty screenshot. Calling the dismissDialogs() method automatically closes such pop-ups, allowing the screenshot to be taken.
Browsershot::url('https://example.com')
->dismissDialogs()
->save($pathToImage);
Disable Javascript
If you want to completely disable JavaScript when capturing the page, use the disableJavascript() method. Be aware that some sites will not render correctly without JavaScript.
Browsershot::url('https://example.com')
->disableJavascript()
->save($pathToImage);
Disable Images
You can completely remove all images and <picture> elements when capturing a page using the disableImages() method.
Browsershot::url('https://example.com')
->disableImages()
->save($pathToImage);
Block Urls
You can completely block connections to specific URLs using the blockUrls() method. This is useful for blocking advertisements and trackers to make screenshot creation faster.
$urlsList = array("example.com/cm-notify?pi=outbrain", "sync.outbrain.com/cookie-sync?p=bidswitch");
Browsershot::url('https://example.com')
->blockUrls($urlsList)
->save($pathToImage);
Block Domains
You can completely block connections to specific domains using the blockDomains() method. This is useful for blocking advertisements and trackers to make screenshot creation faster.
$domainsList = array("googletagmanager.com", "googlesyndication.com", "doubleclick.net", "google-analytics.com");
Browsershot::url('https://example.com')
->blockDomains($domainsList)
->save($pathToImage);
Waiting for lazy-loaded resources
Some websites lazy-load additional resources via ajax or use webfonts, which might not be loaded in time for the screenshot. Using the waitUntilNetworkIdle() method you can tell Browsershot to wait for a period of 500 ms with no network activity before taking the screenshot, ensuring all additional resources are loaded.
Browsershot::url('https://example.com')
->waitUntilNetworkIdle()
->save($pathToImage);
Alternatively you can use the less strict waitUntilNetworkIdle(false), which allows 2 network connections in the 500 ms waiting period. This is useful for websites with scripts that periodically ping an ajax endpoint.
Delayed screenshots
You can delay the taking of the screenshot by using setDelay(). This is useful if you need to wait for JavaScript to complete or if you are attempting to capture lazy-loaded resources.
Browsershot::url('https://example.com')
->setDelay($delayInMilliseconds)
->save($pathToImage);
Waiting for javascript function
You can also wait for a JavaScript function until it returns true by using waitForFunction(). This is useful if you need to wait for a task in JavaScript that is not related to network status.
Browsershot::url('https://example.com')
->waitForFunction('window.innerWidth < 100', $pollingInMilliseconds, $timeoutInMilliseconds)
->save($pathToImage);
Adding JS
You can add javascript prior to your screenshot or output using the syntax for Puppeteer's addScriptTag.
Browsershot::url('https://example.com')
->setOption('addScriptTag', json_encode(['content' => 'alert("Hello World")']))
->save($pathToImage);
Adding CSS
You can add CSS styles prior to your screenshot or output using the syntax for Puppeteer's addStyleTag.
Browsershot::url('https://example.com')
->setOption('addStyleTag', json_encode(['content' => 'body{ font-size: 14px; }']))
->save($pathToImage);
Output directly to the browser
You can output the image directly to the browser using the screenshot() method.
$image = Browsershot::url('https://example.com')
->screenshot();
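To actually deliver the bytes to a browser, the response needs a matching content type. The sketch below assumes that screenshot() returns the raw image bytes as a string; binaryResponseHeaders and sendBinary are hypothetical helpers and not part of Browsershot.

```php
<?php

/**
 * Build the response headers for raw image or PDF bytes.
 *
 * @return string[]
 */
function binaryResponseHeaders(string $bytes, string $mimeType): array
{
    return [
        'Content-Type: ' . $mimeType,
        'Content-Length: ' . strlen($bytes),
    ];
}

/**
 * Send the headers followed by the bytes themselves.
 */
function sendBinary(string $bytes, string $mimeType): void
{
    foreach (binaryResponseHeaders($bytes, $mimeType) as $header) {
        header($header);
    }

    echo $bytes;
}
```

With the $image from above you would call sendBinary($image, 'image/png'). The same helper could be used with 'application/pdf' for the string returned by pdf().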
PDFs
Browsershot will save a pdf if the path passed to the save method has a pdf extension.
// a pdf will be saved
Browsershot::url('https://example.com')->save('example.pdf');
Alternatively you can explicitly use the savePdf method:
Browsershot::url('https://example.com')->savePdf('example.pdf');
You can also pass some html which will be converted to a pdf.
Browsershot::html($someHtml)->savePdf('example.pdf');
Sizing the pdf
You can specify the width and the height.
Browsershot::html($someHtml)
->paperSize($width, $height)
->save('example.pdf');
Optionally you can give a custom unit to paperSize as the third parameter.
Using a predefined format
You can use the format method and provide a format size:
Browsershot::url('https://example.com')->format('A4')->save('example.pdf');
The format options supported by puppeteer are:
Letter: 8.5in x 11in
Legal: 8.5in x 14in
Tabloid: 11in x 17in
Ledger: 17in x 11in
A0: 33.1in x 46.8in
A1: 23.4in x 33.1in
A2: 16.54in x 23.4in
A3: 11.7in x 16.54in
A4: 8.27in x 11.7in
A5: 5.83in x 8.27in
A6: 4.13in x 5.83in
Setting margins
Margins can be set.
Browsershot::html($someHtml)
->margins($top, $right, $bottom, $left)
->save('example.pdf');
Optionally you can give a custom unit to margins as the fifth parameter.
Headers and footers
By default a PDF will not show the header and a footer generated by Chrome. Here's how you can make the header and footer appear. You can also provide a custom HTML template for the header and footer.
Browsershot::html($someHtml)
->showBrowserHeaderAndFooter()
->headerHtml($someHtml)
->footerHtml($someHtml)
->save('example.pdf');
In the header and footer HTML, any tags with the following classes will have their printing value injected into their contents.
date: formatted print date
title: document title
url: document location
pageNumber: current page number
totalPages: total pages in the document
To hide the header or footer, you can call either hideHeader or hideFooter.
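As an illustration of the classes listed above, here is a minimal sketch of a footer template that prints "Page x / y". pageCounterFooter is a hypothetical helper, and the inline font size is an assumption: without an explicit size the template text may render too small to read.

```php
<?php

/**
 * Build a footer template with a page counter.
 * Chrome fills in the empty pageNumber and totalPages elements when printing.
 */
function pageCounterFooter(string $label = 'Page'): string
{
    $label = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');

    return '<div style="font-size: 10px; width: 100%; text-align: center;">'
        . $label . ' <span class="pageNumber"></span>'
        . ' / <span class="totalPages"></span>'
        . '</div>';
}

$footerHtml = pageCounterFooter();
```

The resulting string can be passed to footerHtml(), together with showBrowserHeaderAndFooter() as shown earlier in this section.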
Backgrounds
By default, the resulting PDF will not show the background of the html page. If you do want the background to be included you can call showBackground:
Browsershot::html($someHtml)
->showBackground()
->save('example.pdf');
Landscape orientation
Call landscape if you want the resulting pdf to be landscape oriented.
Browsershot::html($someHtml)
->landscape()
->save('example.pdf');
Scale
Scale can be set. Defaults to 1. Scale amount must be between 0.1 and 2.
Browsershot::html($someHtml)
->scale(0.5)
->save('example.pdf');
Only export specific pages
You can control which pages should be exported by passing a print range to the pages method. Here are some examples of valid print ranges: '1', '1-3' and '1-5, 8, 11-13'.
Browsershot::html($someHtml)
->pages('1-5, 8, 11-13')
->save('example.pdf');
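To make the range syntax concrete, the sketch below expands a print range into the page numbers it covers. It only handles the forms shown above (single pages and closed ranges) and is not how Chrome or Browsershot parse the value; expandPageRanges is a hypothetical helper that could be used to validate user input before calling pages().

```php
<?php

/**
 * Expand a print range such as "1-5, 8, 11-13" into page numbers.
 *
 * @return int[]
 */
function expandPageRanges(string $ranges): array
{
    $pages = [];

    foreach (explode(',', $ranges) as $part) {
        $part = trim($part);

        if (preg_match('/^(\d+)-(\d+)$/', $part, $matches)) {
            $start = (int) $matches[1];
            $end = (int) $matches[2];

            if ($start > $end) {
                throw new InvalidArgumentException("Invalid range: {$part}");
            }

            foreach (range($start, $end) as $page) {
                $pages[] = $page;
            }
        } elseif (preg_match('/^\d+$/', $part)) {
            $pages[] = (int) $part;
        } else {
            throw new InvalidArgumentException("Invalid range: {$part}");
        }
    }

    // Drop duplicates from overlapping ranges and reindex.
    return array_values(array_unique($pages));
}

echo implode(',', expandPageRanges('1-5, 8, 11-13')), PHP_EOL; // 1,2,3,4,5,8,11,12,13
```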
Output directly to the browser
You can output the PDF directly to the browser using the pdf() method.
$pdf = Browsershot::url('https://example.com')
->pdf();
HTML
Browsershot can also get the body of an html page after JavaScript has been executed:
Browsershot::url('https://example.com')->bodyHtml(); // returns the html of the body
Evaluate
Browsershot can evaluate a JavaScript expression in the context of an html page and return the result:
Browsershot::url('https://example.com')
->deviceScaleFactor(2)
->evaluate("window.devicePixelRatio"); // returns 2
Misc
Setting an arbitrary option
You can set any arbitrary option by calling setOption:
Browsershot::url('https://example.com')
->setOption('landscape', true)
->save($pathToImage);
Fixing cors issues
If you experience issues related to CORS, you can opt to disable CORS checks with --disable-web-security.
Browsershot::url('https://example.com')
->setOption('args', ['--disable-web-security'])
->save($pathToImage);
Changing the language of the browser
You can use setOption to change the language of the browser, for example in order to load a page in a specific language.
Browsershot::url('https://example.com')
->setOption('args', ['--lang=en-GB'])
...
Setting the user agent
If you want to set the user agent Google Chrome should use when taking the screenshot you can do so:
Browsershot::url('https://example.com')
->userAgent('My Special Snowflake Browser 1.0')
->save($pathToImage);
Setting the CSS media type of the page
You can emulate the CSS media type. This is especially useful when generating pdfs, because by default the print version of the page is emulated.
Browsershot::url('https://example.com')
->emulateMedia('screen') // "screen", "print" (default) or null (passing null disables the emulation).
->savePdf($pathToPdf);
Setting the timeout
The default timeout of Browsershot is set to 60 seconds. Of course, you can modify this timeout:
Browsershot::url('https://example.com')
->timeout(120)
->save($pathToImage);
Disable sandboxing
When running Linux in certain virtualization environments you might need to disable sandboxing.
Browsershot::url('https://example.com')
->noSandbox()
...
Ignore HTTPS errors
You can ignore HTTPS errors, if necessary.
Browsershot::url('https://example.com')
->ignoreHttpsErrors()
...
Specify a proxy Server
You can specify a proxy server to use when connecting. The argument passed to setProxyServer will be passed to the --proxy-server= option of Chromium. More info here: https://www.chromium.org/developers/design-documents/network-settings#TOC-Command-line-options-for-proxy-settings
Browsershot::url('https://example.com')
->setProxyServer("1.2.3.4:8080")
...
Setting extraHTTPHeaders
To send custom HTTP headers, use the setExtraHttpHeaders method like so:
Browsershot::url('https://example.com')
->setExtraHttpHeaders(['Custom-Header-Name' => 'Custom-Header-Value'])
...
Using HTTP Authentication
You can provide credentials for HTTP authentication:
Browsershot::url('https://example.com')
->authenticate('username', 'password')
...
Using Cookies
You can add cookies to the request to the given url:
Browsershot::url('https://example.com')
->useCookies(['Cookie-Key' => 'Cookie-Value'])
...
You can specify the domain to register cookies to, if necessary:
Browsershot::url('https://example.com')
->useCookies(['Cookie-Key' => 'Cookie-Value'], 'ui.example.com')
...
Sending POST requests
By default, all requests are sent using the GET method. You can make a POST request to the given url by using the post method. Note: POST requests are sent using the application/x-www-form-urlencoded content type.
Browsershot::url('https://example.com')
->post(['foo' => 'bar'])
...
Clicking on the page
You can specify clicks on the page.
Browsershot::url('https://example.com')
->click('#selector1')
// Right click 5 times on #selector2, each click lasting 200 milliseconds.
->click('#selector2', 'right', 5, 200);
Typing on the page
You can type on the page (you can use this to fill form fields).
Browsershot::url('https://example.com')
->type('#selector1', 'Hello, is it me you are looking for?');
You can combine type and click to create a screenshot of a page after submitting a form:
Browsershot::url('https://example.com')
->type('#firstName', 'My name')
->click('#submit')
->setDelay($millisecondsToWait)
->save($pathToImage);
Changing the value of a dropdown
You can change the value of a dropdown on the page (you can use this to change form select fields).
Browsershot::url('https://example.com')
->selectOption('#selector1', '100');
You can combine selectOption, type and click to create a screenshot of a page after submitting a form:
Browsershot::url('https://example.com')
->type('#firstName', 'My name')
->selectOption('#state', 'MT')
->click('#submit')
->setDelay($millisecondsToWait)
->save($pathToImage);
Writing options to file
When the amount of options given to puppeteer becomes too big, Browsershot will fail because of an overflow of characters in the command line. Browsershot can write the options to a file and pass that file to puppeteer and so bypass the character overflow.
Browsershot::url('https://example.com')
->writeOptionsToFile()
...
Connection to a remote chromium/chrome instance
If you have a remote endpoint for a running chromium/chrome instance, properly configured with the param --remote-debugging-port, you can connect to it using the method setRemoteInstance. You only need to specify its IP and port (the defaults are 127.0.0.1 and 9222, respectively). If no instance is available at the given endpoint (instance crashed, restarting instance, etc.), this will fall back to launching a chromium instance.
Browsershot::url('https://example.com')
->setRemoteInstance('1.2.3.4', 9222)
...
Using a pipe instead of a WebSocket
If you want to connect to the browser over a pipe instead of a WebSocket, you can use:
Browsershot::url('https://example.com')
->usePipe()
...
Passing environment variables to the browser
If you want to set custom environment variables which affect the browser instance you can use:
Browsershot::url('https://example.com')
->setEnvironmentOptions(['TZ' => 'Pacific/Auckland'])
...
Related packages
- Laravel wrapper: laravel-browsershot
Contributing
Please see CONTRIBUTING for details.
Security
If you discover any security related issues, please email [email protected] instead of using the issue tracker.
Alternatives
If you're not able to install Node and Puppeteer, take a look at v2 of browsershot, which uses Chrome headless CLI to take a screenshot. v2 is not maintained anymore, but should work pretty well.
If using headless Chrome does not work for you, take a look at v1 of this package, which uses the abandoned PhantomJS binary.
Credits
And a special thanks to Caneco for the logo
License
The MIT License (MIT). Please see License File for more information.