<!DOCTYPE html>
<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1254">
<title>Use selector-syntax to find elements: jsoup Java HTML parser</title>
<meta name="keywords" content="select, selector, css, jquery">
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link type="text/css" rel="stylesheet" href="./jsoup parse selectors_files/style.css">
</head>
<body class="n1-cookbook">
<div class="wrap">
<div class="header">
<div class="nav-sections">
<ul>
<li class="n1-home"><h4><a href="http://jsoup.org/">jsoup</a></h4></li>
<li class="n1-news"><a href="http://jsoup.org/news/">News</a></li>
<li class="n1-bugs"><a href="http://jsoup.org/bugs">Bugs</a></li>
<li class="n1-discussion"><a href="http://jsoup.org/discussion">Discussion</a></li>
<li class="n1-download"><a href="http://jsoup.org/download">Download</a></li>
<li class="n1-api"><a href="http://jsoup.org/apidocs/">API Reference</a></li>
<li class="n1-cookbook"><a href="http://jsoup.org/cookbook/">Cookbook</a></li>
<li class="n1-try"><a href="http://try.jsoup.org/">Try jsoup</a></li>
</ul>
</div>
</div>
<div class="content">
<div class="col1">
<div class="recipe">
<h1>Use selector-syntax to find elements</h1>
<h2>Problem</h2>
<p>You want to find or manipulate elements using a CSS or jQuery-like selector syntax.</p>
<h2>Solution</h2>
<p>Use the <code><a href="http://jsoup.org/apidocs/org/jsoup/nodes/Element.html#select(java.lang.String)" title="Find elements that match the Selector CSS query, with this element as the starting context.">Element.select(String selector)</a></code> and <code><a href="http://jsoup.org/apidocs/org/jsoup/select/Elements.html#select(java.lang.String)" title="Find matching elements within this element list.">Elements.select(String selector)</a></code> methods:</p>
<pre><code class="prettyprint">import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse a local file; the base URI is used to resolve relative links
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

Elements links = doc.select("a[href]"); // a with href
Elements pngs = doc.select("img[src$=.png]"); // img with src ending .png

Element masthead = doc.select("div.masthead").first(); // div with class=masthead

Elements resultLinks = doc.select("h3.r &gt; a"); // a that is a direct child of h3.r</code></pre>
<h2>Description</h2>
<p>jsoup elements support a <a href="http://www.w3.org/TR/2009/PR-css3-selectors-20091215/">CSS</a>-like (or <a href="http://jquery.com/">jQuery</a>-like) selector syntax for finding matching elements, which lets you express complex queries concisely.</p>
<p>The <code>select</code> method is available on a <code><a href="http://jsoup.org/apidocs/org/jsoup/nodes/Document.html" title="A HTML Document.">Document</a></code>, an <code><a href="http://jsoup.org/apidocs/org/jsoup/nodes/Element.html" title="A HTML element consists of a tag name, attributes, and child nodes (including text nodes and other elements).">Element</a></code>, or an <code><a href="http://jsoup.org/apidocs/org/jsoup/select/Elements.html" title="A list of Elements, with methods that act on every element in the list.">Elements</a></code> list. It is contextual: a query matches only within the element (or elements) it is called on, so you can narrow results by selecting from a specific element or by chaining select calls.</p>
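<p>As a minimal sketch of contextual selection, the example below first selects from one specific element and then chains two <code>select</code> calls. The markup and the class and method names are hypothetical, invented for this illustration; only <code>Jsoup.parse</code>, <code>select</code>, and <code>first</code> are jsoup calls.</p>

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ContextualSelect {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
        "<div id='nav'><a href='/home'>Home</a><a href='/news'>News</a></div>"
      + "<div id='main'><p><a href='/one'>One</a></p><p>No link</p></div>";

    // Select from a specific element: only links inside #main are considered.
    static Elements linksInMain() {
        Document doc = Jsoup.parse(HTML);
        Element main = doc.select("div#main").first();
        return main.select("a[href]");
    }

    // Chain select calls: find every div, then the links within those divs.
    static Elements linksInAnyDiv() {
        Document doc = Jsoup.parse(HTML);
        return doc.select("div").select("a[href]");
    }

    public static void main(String[] args) {
        System.out.println("links in #main: " + linksInMain().size());
        System.out.println("links in any div: " + linksInAnyDiv().size());
    }
}
```

<p>With this sample markup, the scoped query sees only the single link under <code>#main</code>, while the chained query sees the links in both divs.</p>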
<p><code>select</code> returns the matching elements as an <code><a href="http://jsoup.org/apidocs/org/jsoup/select/Elements.html" title="A list of Elements, with methods that act on every element in the list.">Elements</a></code> list, which provides a range of methods to extract and manipulate the results.</p>
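<p>The following sketch shows one way to work with the returned list: iterating over it to read an attribute from each match, and calling a setter on the list to update every match at once. The markup and helper names are assumptions made up for this example.</p>

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ElementsResults {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
        "<ul><li><a href='/a'>Alpha</a></li><li><a href='/b'>Beta</a></li></ul>";

    // Extract: Elements is iterable, so each match can be read in turn.
    static List<String> hrefs() {
        Elements links = Jsoup.parse(HTML).select("li > a[href]");
        List<String> out = new ArrayList<>();
        for (Element link : links) {
            out.add(link.attr("href"));
        }
        return out;
    }

    // Manipulate: attr(key, value) on the list sets the attribute on every match.
    static int markNoFollow() {
        Document doc = Jsoup.parse(HTML);
        doc.select("a[href]").attr("rel", "nofollow");
        return doc.select("a[rel=nofollow]").size();
    }

    public static void main(String[] args) {
        System.out.println(hrefs());
        System.out.println(markNoFollow());
    }
}
```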
<h3>Selector overview</h3>
<ul>
<li><code>tagname</code>: find elements by tag, e.g. <code>a</code></li>
<li><code>ns|tag</code>: find elements by tag in a namespace, e.g. <code>fb|name</code> finds <code>&lt;fb:name&gt;</code> elements</li>
<li><code>#id</code>: find elements by ID, e.g. <code>#logo</code></li>
<li><code>.class</code>: find elements by class name, e.g. <code>.masthead</code></li>
<li><code>[attribute]</code>: elements with attribute, e.g. <code>[href]</code></li>
<li><code>[^attr]</code>: elements with an attribute name prefix, e.g. <code>[^data-]</code> finds elements with HTML5 dataset attributes</li>
<li><code>[attr=value]</code>: elements with attribute value, e.g. <code>[width=500]</code> (also quotable, like <code>[data-name="launch sequence"]</code>)</li>
<li><code>[attr^=value]</code>, <code>[attr$=value]</code>, <code>[attr*=value]</code>: elements with attributes that start with, end with, or contain the value, e.g. <code>[href*=/path/]</code></li>
<li><code>[attr~=regex]</code>: elements with attribute values that match the regular expression; e.g. <code>img[src~=(?i)\.(png|jpe?g)]</code></li>
<li><code>*</code>: all elements, e.g. <code>*</code></li>
</ul>
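<p>The selectors listed above can be tried against a small document, as in the sketch below. The markup and the <code>count</code> helper are hypothetical, written only for this illustration; note that the backslash in the regular expression must be doubled inside a Java string literal.</p>

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class AttributeSelectors {
    // Hypothetical sample markup, invented for this illustration.
    static final Document DOC = Jsoup.parse(
        "<a id='logo' class='masthead' href='/'>Home</a>"
      + "<a href='/path/page' data-name='launch sequence' data-id='7'>Go</a>"
      + "<img src='logo.png' width='500'>"
      + "<img src='photo.JPEG'>"
      + "<img src='anim.gif'>");

    // Number of elements in the sample document that match the query.
    static int count(String cssQuery) {
        return DOC.select(cssQuery).size();
    }

    public static void main(String[] args) {
        String[] queries = {
            "a",                                // by tag
            "#logo",                            // by ID
            ".masthead",                        // by class name
            "[href]",                           // attribute present
            "[^data-]",                         // attribute name prefix
            "[width=500]",                      // attribute value
            "[data-name=\"launch sequence\"]",  // quoted attribute value
            "[href*=/path/]",                   // value contains
            "img[src$=.png]",                   // value ends with
            "img[src~=(?i)\\.(png|jpe?g)]"      // value matches regex
        };
        for (String query : queries) {
            System.out.println(query + " -> " + count(query));
        }
    }
}
```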
<h3>Selector combinations</h3>
<ul>
<li><code>el#id</code>: elements with ID, e.g. <code>div#logo</code></li>
<li><code>el.class</code>: elements with class, e.g. <code>div.masthead</code></li>
<li><code>el[attr]</code>: elements with attribute, e.g. <code>a[href]</code></li>
<li>Any combination, e.g. <code>a[href].highlight</code></li>
<li><code>ancestor child</code>: child elements that descend from ancestor, e.g. <code>.body p</code> finds <code>p</code> elements anywhere under a block with class "body"</li>
<li><code>parent &gt; child</code>: child elements that descend directly from parent, e.g. <code>div.content &gt; p</code> finds <code>p</code> elements that are direct children of <code>div.content</code>; and <code>body &gt; *</code> finds the direct children of the body tag</li>
<li><code>siblingA + siblingB</code>: finds sibling B element immediately preceded by sibling A, e.g. <code>div.head + div</code></li>
<li><code>siblingA ~ siblingX</code>: finds sibling X element preceded by sibling A, e.g. <code>h1 ~ p</code></li>
<li><code>el, el, el</code>: group multiple selectors, find unique elements that match any of the selectors; e.g. <code>div.masthead, div.logo</code></li>
</ul>
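<p>To illustrate how the combinators above differ, here is a short sketch that runs several of them against the same fragment. The markup and the <code>count</code> and <code>firstText</code> helpers are assumptions made for this example, not part of jsoup.</p>

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorCombinations {
    // Hypothetical sample markup, invented for this illustration.
    static final Document DOC = Jsoup.parse(
        "<div class='body'>"
      +   "<h1>Title</h1>"
      +   "<p>Intro</p>"
      +   "<div class='head'>Head</div>"
      +   "<div><p>Nested</p></div>"
      +   "<p>Outro</p>"
      + "</div>");

    // Number of elements in the sample document that match the query.
    static int count(String cssQuery) {
        return DOC.select(cssQuery).size();
    }

    // Text of the first match; assumes the query matches at least one element.
    static String firstText(String cssQuery) {
        return DOC.select(cssQuery).first().text();
    }

    public static void main(String[] args) {
        System.out.println(count(".body p"));          // any depth under .body
        System.out.println(count(".body > p"));        // direct children only
        System.out.println(firstText("div.head + div")); // immediately following sibling
        System.out.println(count("h1 ~ p"));           // later siblings of h1
        System.out.println(count("h1, div.head"));     // union of two selectors
        System.out.println(count("body > *"));         // direct children of body
    }
}
```

<p>In this fragment the descendant query also reaches the nested paragraph, whereas the child and sibling queries see only the paragraphs that sit directly beside the heading.</p>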
<h3>Pseudo selectors</h3>
<ul>
<li><code>:lt(n)</code>: find elements whose sibling index (i.e. its position in the DOM tree relative to its parent) is less than <code>n</code>; e.g. <code>td:lt(3)</code></li>
<li><code>:gt(n)</code>: find elements whose sibling index is greater than <code>n</code>; e.g. <code>div p:gt(2)</code></li>
<li><code>:eq(n)</code>: find elements whose sibling index is equal to <code>n</code>; e.g. <code>form input:eq(1)</code></li>
<li><code>:has(selector)</code>: find elements that contain elements matching the selector; e.g. <code>div:has(p)</code></li>
<li><code>:not(selector)</code>: find elements that do not match the selector; e.g. <code>div:not(.logo)</code></li>
<li><code>:contains(text)</code>: find elements that contain the given text. The search is case-insensitive; e.g. <code>p:contains(jsoup)</code></li>
<li><code>:containsOwn(text)</code>: find elements that directly contain the given text</li>
<li><code>:matches(regex)</code>: find elements whose text matches the specified regular expression; e.g. <code>div:matches((?i)login)</code></li>
<li><code>:matchesOwn(regex)</code>: find elements whose own text matches the specified regular expression</li>
<li>Note that the indexed pseudo-selectors above (<code>:lt</code>, <code>:gt</code>, <code>:eq</code>) are 0-based: the first element is at index 0, the second at 1, and so on.</li>
</ul>
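<p>The pseudo selectors above can be exercised with a sketch like the following, which contrasts the indexed selectors and the "own text" variants. The markup and the <code>count</code> and <code>firstText</code> helpers are hypothetical, written only for this illustration.</p>

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PseudoSelectors {
    // Hypothetical sample markup, invented for this illustration.
    static final Document DOC = Jsoup.parse(
        "<table><tr><td>a</td><td>b</td><td>c</td><td>d</td></tr></table>"
      + "<div class='logo'><p>Welcome to JSOUP</p></div>"
      + "<div><span>Login here</span></div>"
      + "<div>plain</div>");

    // Number of elements in the sample document that match the query.
    static int count(String cssQuery) {
        return DOC.select(cssQuery).size();
    }

    // Text of the first match; assumes the query matches at least one element.
    static String firstText(String cssQuery) {
        return DOC.select(cssQuery).first().text();
    }

    public static void main(String[] args) {
        // Indexed selectors use the 0-based sibling index.
        System.out.println(count("td:lt(3)"));
        System.out.println(count("td:gt(2)"));
        System.out.println(firstText("td:eq(1)"));

        // Structural filters.
        System.out.println(count("div:has(p)"));
        System.out.println(count("div:not(.logo)"));

        // Text filters: :contains and :matches look at all descendant text,
        // the Own variants only at text held directly by the element.
        System.out.println(count("p:contains(jsoup)"));
        System.out.println(count("div:containsOwn(jsoup)"));
        System.out.println(count("div:containsOwn(plain)"));
        System.out.println(count("div:matches((?i)login)"));
        System.out.println(count("div:matchesOwn((?i)login)"));
        System.out.println(count("span:matchesOwn((?i)login)"));
    }
}
```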
<p>See the <code><a href="http://jsoup.org/apidocs/org/jsoup/select/Selector.html" title="CSS-like element selector, that finds elements matching a query.">Selector</a></code> API reference for the full supported list and details.</p>
</div>
</div>
<!-- /col1 -->
<div class="col2">
<div class="toc box">
<h2><a href="http://jsoup.org/cookbook">Cookbook contents</a></h2>
<h3>Introduction</h3>
<ol start="1">
<li><a href="http://jsoup.org/cookbook/introduction/parsing-a-document">Parsing and traversing a Document</a></li>
</ol>
<h3>Input</h3>
<ol start="2">
<li><a href="http://jsoup.org/cookbook/input/parse-document-from-string">Parse a document from a String</a></li>
<li><a href="http://jsoup.org/cookbook/input/parse-body-fragment">Parsing a body fragment</a></li>
<li><a href="http://jsoup.org/cookbook/input/load-document-from-url">Load a Document from a URL</a></li>
<li><a href="http://jsoup.org/cookbook/input/load-document-from-file">Load a Document from a File</a></li>
</ol>
<h3>Extracting data</h3>
<ol start="6">
<li><a href="http://jsoup.org/cookbook/extracting-data/dom-navigation">Use DOM methods to navigate a document</a></li>
<li class="activePage">Use selector-syntax to find elements</li>
<li><a href="http://jsoup.org/cookbook/extracting-data/attributes-text-html">Extract attributes, text, and HTML from elements</a></li>
<li><a href="http://jsoup.org/cookbook/extracting-data/working-with-urls">Working with URLs</a></li>
<li><a href="http://jsoup.org/cookbook/extracting-data/example-list-links">Example program: list links</a></li>
</ol>
<h3>Modifying data</h3>
<ol start="11">
<li><a href="http://jsoup.org/cookbook/modifying-data/set-attributes">Set attribute values</a></li>
<li><a href="http://jsoup.org/cookbook/modifying-data/set-html">Set the HTML of an element</a></li>
<li><a href="http://jsoup.org/cookbook/modifying-data/set-text">Setting the text content of elements</a></li>
</ol>
<h3>Cleaning HTML</h3>
<ol start="14">
<li><a href="http://jsoup.org/cookbook/cleaning-html/whitelist-sanitizer">Sanitize untrusted HTML (to prevent XSS)</a></li>
</ol>
</div>
</div>
<!-- /col2 -->
</div>
<!-- /content-->
<div class="footer">
<b>jsoup HTML parser</b> © 2009 - 2015
<a href="http://jhy.io/" rel="author"><b>Jonathan Hedley</b></a>
</div>
</div>
<!-- /wrap -->
<script src="./jsoup parse selectors_files/prettify.js"></script>
<script>prettyPrint();</script>
</body></html>