Skip to content

Commit

Permalink
Generating and committing HTML pages
Browse files Browse the repository at this point in the history
gvwilson committed Nov 23, 2014
1 parent 00bf679 commit f13a14a
Showing 21 changed files with 3,720 additions and 101 deletions.
506 changes: 506 additions & 0 deletions 01-select.html

Large diffs are not rendered by default.

396 changes: 396 additions & 0 deletions 02-sort-dup.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,396 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: Databases and SQL</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<link rel="stylesheet" href="css/bootstrap/bootstrap.css" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" href="css/bootstrap/bootstrap-responsive.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<div class="row-fluid">
<div class="span10 offset1">
<h1 class="title">Databases and SQL</h1>
<h2 class="subtitle">Sorting and Removing Duplicates</h2>
<div id="learning-objectives" class="objectives">
<h2>Learning Objectives</h2>
<ul>
<li>Write queries that display results in a particular order.</li>
<li>Write queries that eliminate duplicate values from data.</li>
</ul>
</div>
<p>Data is often redundant, so queries often return redundant information. For example, if we select the quantitites that have been measured from the <code>survey</code> table, we get this:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> quant <span class="kw">from</span> Survey;</code></pre>
<h2 id="quant">quant</h2>
<p>rad<br />sal<br />rad<br />sal<br />rad<br />sal<br />temp<br />rad<br />sal<br />temp<br />rad<br />temp<br />sal<br />rad<br />sal<br />temp<br />sal<br />rad<br />sal<br />sal<br />rad</p>
<p>We can eliminate the redundant output to make the result more readable by adding the <code>distinct</code> keyword to our query:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> <span class="kw">distinct</span> quant <span class="kw">from</span> Survey;</code></pre>
<h2 id="quant-1">quant</h2>
<p>rad<br />sal<br />temp</p>
<p>If we select more than one column --- for example, both the survey site ID and the quantity measured --- then the distinct pairs of values are returned:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> <span class="kw">distinct</span> taken, quant <span class="kw">from</span> Survey;</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">taken</th>
<th align="left">quant</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">619</td>
<td align="left">rad</td>
</tr>
<tr class="even">
<td align="left">619</td>
<td align="left">sal</td>
</tr>
<tr class="odd">
<td align="left">622</td>
<td align="left">rad</td>
</tr>
<tr class="even">
<td align="left">622</td>
<td align="left">sal</td>
</tr>
<tr class="odd">
<td align="left">734</td>
<td align="left">rad</td>
</tr>
<tr class="even">
<td align="left">734</td>
<td align="left">sal</td>
</tr>
<tr class="odd">
<td align="left">734</td>
<td align="left">temp</td>
</tr>
<tr class="even">
<td align="left">735</td>
<td align="left">rad</td>
</tr>
<tr class="odd">
<td align="left">735</td>
<td align="left">sal</td>
</tr>
<tr class="even">
<td align="left">735</td>
<td align="left">temp</td>
</tr>
<tr class="odd">
<td align="left">751</td>
<td align="left">rad</td>
</tr>
<tr class="even">
<td align="left">751</td>
<td align="left">temp</td>
</tr>
<tr class="odd">
<td align="left">751</td>
<td align="left">sal</td>
</tr>
<tr class="even">
<td align="left">752</td>
<td align="left">rad</td>
</tr>
<tr class="odd">
<td align="left">752</td>
<td align="left">sal</td>
</tr>
<tr class="even">
<td align="left">752</td>
<td align="left">temp</td>
</tr>
<tr class="odd">
<td align="left">837</td>
<td align="left">rad</td>
</tr>
<tr class="even">
<td align="left">837</td>
<td align="left">sal</td>
</tr>
<tr class="odd">
<td align="left">844</td>
<td align="left">rad</td>
</tr>
</tbody>
</table>
<p>Notice in both cases that duplicates are removed even if they didn't appear to be adjacent in the database. Again, it's important to remember that rows aren't actually ordered: they're just displayed that way.</p>
<p>As we mentioned earlier, database records are not stored in any particular order. This means that query results aren't necessarily sorted, and even if they are, we often want to sort them in a different way, e.g., by the name of the project instead of by the name of the scientist. We can do this in SQL by adding an <code>order by</code> clause to our query:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> * <span class="kw">from</span> Person <span class="kw">order</span> <span class="kw">by</span> ident;</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">ident</th>
<th align="left">personal</th>
<th align="left">family</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">danforth</td>
<td align="left">Frank</td>
<td align="left">Danforth</td>
</tr>
<tr class="even">
<td align="left">dyer</td>
<td align="left">William</td>
<td align="left">Dyer</td>
</tr>
<tr class="odd">
<td align="left">lake</td>
<td align="left">Anderson</td>
<td align="left">Lake</td>
</tr>
<tr class="even">
<td align="left">pb</td>
<td align="left">Frank</td>
<td align="left">Pabodie</td>
</tr>
<tr class="odd">
<td align="left">roe</td>
<td align="left">Valentina</td>
<td align="left">Roerich</td>
</tr>
</tbody>
</table>
<p>By default, results are sorted in ascending order (i.e., from least to greatest). We can sort in the opposite order using <code>desc</code> (for &quot;descending&quot;):</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> * <span class="kw">from</span> person <span class="kw">order</span> <span class="kw">by</span> ident <span class="kw">desc</span>;</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">ident</th>
<th align="left">personal</th>
<th align="left">family</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">roe</td>
<td align="left">Valentina</td>
<td align="left">Roerich</td>
</tr>
<tr class="even">
<td align="left">pb</td>
<td align="left">Frank</td>
<td align="left">Pabodie</td>
</tr>
<tr class="odd">
<td align="left">lake</td>
<td align="left">Anderson</td>
<td align="left">Lake</td>
</tr>
<tr class="even">
<td align="left">dyer</td>
<td align="left">William</td>
<td align="left">Dyer</td>
</tr>
<tr class="odd">
<td align="left">danforth</td>
<td align="left">Frank</td>
<td align="left">Danforth</td>
</tr>
</tbody>
</table>
<p>(And if we want to make it clear that we're sorting in ascending order, we can use <code>asc</code> instead of <code>desc</code>.)</p>
<p>We can also sort on several fields at once. For example, this query sorts results first in ascending order by <code>taken</code>, and then in descending order by <code>person</code> within each group of equal <code>taken</code> values:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> taken, person <span class="kw">from</span> Survey <span class="kw">order</span> <span class="kw">by</span> taken <span class="kw">asc</span>, person <span class="kw">desc</span>;</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">taken</th>
<th align="left">person</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">619</td>
<td align="left">dyer</td>
</tr>
<tr class="even">
<td align="left">619</td>
<td align="left">dyer</td>
</tr>
<tr class="odd">
<td align="left">622</td>
<td align="left">dyer</td>
</tr>
<tr class="even">
<td align="left">622</td>
<td align="left">dyer</td>
</tr>
<tr class="odd">
<td align="left">734</td>
<td align="left">pb</td>
</tr>
<tr class="even">
<td align="left">734</td>
<td align="left">pb</td>
</tr>
<tr class="odd">
<td align="left">734</td>
<td align="left">lake</td>
</tr>
<tr class="even">
<td align="left">735</td>
<td align="left">pb</td>
</tr>
<tr class="odd">
<td align="left">735</td>
<td align="left">---</td>
</tr>
<tr class="even">
<td align="left">735</td>
<td align="left">---</td>
</tr>
<tr class="odd">
<td align="left">751</td>
<td align="left">pb</td>
</tr>
<tr class="even">
<td align="left">751</td>
<td align="left">pb</td>
</tr>
<tr class="odd">
<td align="left">751</td>
<td align="left">lake</td>
</tr>
<tr class="even">
<td align="left">752</td>
<td align="left">roe</td>
</tr>
<tr class="odd">
<td align="left">752</td>
<td align="left">lake</td>
</tr>
<tr class="even">
<td align="left">752</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">752</td>
<td align="left">lake</td>
</tr>
<tr class="even">
<td align="left">837</td>
<td align="left">roe</td>
</tr>
<tr class="odd">
<td align="left">837</td>
<td align="left">lake</td>
</tr>
<tr class="even">
<td align="left">837</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">844</td>
<td align="left">roe</td>
</tr>
</tbody>
</table>
<p>This is easier to understand if we also remove duplicates:</p>
<pre class="sourceCode sql"><code class="sourceCode sql"><span class="kw">select</span> <span class="kw">distinct</span> taken, person <span class="kw">from</span> Survey <span class="kw">order</span> <span class="kw">by</span> taken <span class="kw">asc</span>, person <span class="kw">desc</span>;</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">taken</th>
<th align="left">person</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">619</td>
<td align="left">dyer</td>
</tr>
<tr class="even">
<td align="left">622</td>
<td align="left">dyer</td>
</tr>
<tr class="odd">
<td align="left">734</td>
<td align="left">pb</td>
</tr>
<tr class="even">
<td align="left">734</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">735</td>
<td align="left">pb</td>
</tr>
<tr class="even">
<td align="left">735</td>
<td align="left">---</td>
</tr>
<tr class="odd">
<td align="left">751</td>
<td align="left">pb</td>
</tr>
<tr class="even">
<td align="left">751</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">752</td>
<td align="left">roe</td>
</tr>
<tr class="even">
<td align="left">752</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">837</td>
<td align="left">roe</td>
</tr>
<tr class="even">
<td align="left">837</td>
<td align="left">lake</td>
</tr>
<tr class="odd">
<td align="left">844</td>
<td align="left">roe</td>
</tr>
</tbody>
</table>
<div id="fixme" class="challenge">
<h2>FIXME</h2>
<p>Write a query that selects distinct dates from the <code>Site</code> table.</p>
</div>
<div id="fixme-1" class="challenge">
<h2>FIXME</h2>
<p>Write a query that returns the distinct dates in the <code>Visited</code> table.</p>
</div>
<div id="fixme-2" class="challenge">
<h2>FIXME</h2>
<p>Write a query that displays the full names of the scientists in the <code>Person</code> table, ordered by family name.</p>
</div>
</div>
</div>
<div class="footer">
<a class="label swc-blue-bg" href="mailto:admin@software-carpentry.org">Email</a>
<a class="label swc-blue-bg" href="http://github.com/swcarpentry">GitHub</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
<a class="bugreport label swc-blue-bg" href="mailto:admin@software-carpentry.org?subject=bug%20in%20{{page.path}}">Bug Report</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="http://software-carpentry.org/v5/js/bootstrap/bootstrap.min.js"></script>
</body>
</html>
Loading
Oops, something went wrong.

0 comments on commit f13a14a

Please sign in to comment.