Skip to content

Commit

Permalink
finally adding shell lessons and cheatsheets for shell and sql
Browse files Browse the repository at this point in the history
kcranston committed Jun 20, 2014
1 parent dd1988c commit 9aa1327
Showing 17 changed files with 39,049 additions and 0 deletions.
105 changes: 105 additions & 0 deletions cheatsheets/shell.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
#Introduction to the shell

Based on the very detailed material at [http://software-carpentry.org/v5/novice/shell/index.html](http://software-carpentry.org/v5/novice/shell/index.html).

##Objectives

* Introduce concept of command-line interface
* Understand the advantages of text files over other file types
* Explain how to navigate your computer's directory structures and understand paths
* Bash commands: using flags, finding documentation
* Move, copy and delete files
* Using tab completion for efficiency
* Use of wildcards for pattern matching


##Commands and hints

### Navigating, creating and moving things
Printing working directory:

$ pwd

List contents of a directory:

$ ls

Change directories:

$ cd
$ cd ..

Create and remove directories:

$ mkdir
$ rmdir

Copy, move, delete:

$ cp
$ mv
$ rm

*Hint: tab completion makes you more efficient and less error-prone*

Create and edit a file:

$ touch readme.txt
$ nano readme.txt

###Working with file contents

Seeing the contents of a file:

$ cat
$ less
$ head
$ tail

How big is this file?

$ ls -lh
$ wc

Redirecting output to a file:

$ wc -l *.csv > number_lines.csv

Sorting the contents of a file:

$ sort number_lines.csv

Pipe commands together:

$ wc -l *.csv | sort

Getting subset of rows:

$head
$tail

Getting subset of columns (in this case, 3rd column of a file called inputfile.csv):

$ cut -f 3 -d , inputfile.csv

## Finding things

In files and in directories:

$find
$grep

Finding help (manual, or 'man' pages):

$man <command>

## What did I do?

$history







Binary file added cheatsheets/shell.pdf
Binary file not shown.
132 changes: 132 additions & 0 deletions cheatsheets/sql.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
Software Carpentry SQL Cheat Sheet
==================================

Basic Queries
-------------

Select one or more columns of data from a table:

SELECT column_name_1, column_name_2 FROM table_name;

Select all of the columns in a table:

SELECT * FROM table_name;

Get only unique lines in a query:

SELECT DISTINCT column_name FROM table_name;

Perform calculations in a query:

SELECT column_name_1, ROUND(column_name_2 / 1000.0) FROM table_name;


Filtering
---------

Select only the data meeting certain criteria:

SELECT * FROM table_name WHERE column_name = 'Hello World';

Combine conditions:

SELECT * FROM table_name WHERE (column_name_1 >= 1000) AND (column_name_2 = 'A' OR column_name_2 = 'B');


Sorting
-------

Sort results using `ASC` for ascending order or `DESC` for descending order:

SELECT * FROM table_name ORDER BY column_name_1 ASC, column_name_2 DESC;


Missing Data
------------

Use `NULL` to represent missing data.

`NULL` is neither true nor false.
Operations involving `NULL` produce `NULL`, e.g., `1+NULL`, `2>NULL`, and `3=NULL` are all `NULL`.

Test whether a value is null:

SELECT * FROM table_name WHERE column_name IS NULL;

Test whether a value is not null:

SELECT * FROM table_name WHERE column_name IS NOT NULL;


Grouping and Aggregation
------------------------

Combine data into groups and calculate combined values in groups:

SELECT column_name_1, SUM(column_name_2), COUNT(*) FROM table_name GROUP BY column_name_1;


Joins
-----

Join data from two tables:

SELECT * FROM table_name_1 JOIN table_name_2 ON table_name_1.column_name = table_name_2.column_name;


Combining Commands
------------------

SQL commands must be combined in the following order:
`SELECT`, `FROM`, `JOIN`, `ON`, `WHERE`, `GROUP BY`, `ORDER BY`.


Creating Tables
---------------

Create tables by specifying column names and types.
Include primary and foreign key relationships and other constraints.

CREATE TABLE survey(
taken INTEGER NOT NULL,
person TEXT,
quant REAL NOT NULL,
PRIMARY KEY(taken, quant),
FOREIGN KEY(person) REFERENCES person(ident)
);

Transactions
------------

Put multiple queries in a transaction to ensure they are ACID
(atomic, consistent, isolated, and durable):

BEGIN TRANSACTION;
DELETE FROM table_name_1 WHERE condition;
INSERT INTO table_name_2 values(...);
END TRANSACTION;

Programming
-----------

Execute queries in a general-purpose programming language by:

* loading the appropriate library
* creating a connection
* creating a cursor
* repeatedly:
* execute a query
* fetch some or all results
* disposing of the cursor
* closing the connection

Python example:

import sqlite3
connection = sqlite3.connect("database_name")
cursor = connection.cursor()
cursor.execute("...query...")
for r in cursor.fetchall():
...process result r...
cursor.close()
connection.close()
Binary file added cheatsheets/sql.pdf
Binary file not shown.
94 changes: 94 additions & 0 deletions lessons/shell/00-intro.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
---
layout: lesson
root: ../..
title: Introducing the Shell
---
<div class="objectives" markdown="1">

#### Objectives
* Explain how the shell relates to the keyboard, the screen, the operating system, and users' programs.
* Explain when and why command-line interfaces should be used instead of graphical interfaces.

</div>

#### What and Why

At a high level, computers do four things:

- run programs;
- store data;
- communicate with each other; and
- interact with us.

They can do the last of these in many different ways,
including direct brain-computer links and speech interfaces.
Since these are still in their infancy,
most of us use windows, icons, mice, and pointers.
These technologies didn't become widespread until the 1980s,
but their roots go back to Doug Engelbart's work in the 1960s,
which you can see in what has been called
"[The Mother of All Demos](http://www.youtube.com/watch?v=a11JDLBXtPQ)".

Going back even further,
the only way to interact with early computers was to rewire them.
But in between,
from the 1950s to the 1980s,
most people used line printers.
These devices only allowed input and output of the letters, numbers, and punctuation found on a standard keyboard,
so programming languages and interfaces had to be designed around that constraint.

This kind of interface is called a
[command-line interface](../../gloss.html#cli), or CLI,
to distinguish it from the
[graphical user interface](../../gloss.html#gui), or GUI,
that most people now use.
The heart of a CLI is a [read-evaluate-print loop](../../gloss.html#repl), or REPL:
when the user types a command and then presses the enter (or return) key,
the computer reads it,
executes it,
and prints its output.
The user then types another command,
and so on until the user logs off.

This description makes it sound as though the user sends commands directly to the computer,
and the computer sends output directly to the user.
In fact,
there is usually a program in between called a
[command shell](../../gloss.html#shell).
What the user types goes into the shell;
it figures out what commands to run and orders the computer to execute them.

A shell is a program like any other.
What's special about it is that its job is to run other programs
rather than to do calculations itself.
The most popular Unix shell is Bash,
the Bourne Again SHell
(so-called because it's derived from a shell written by Stephen Bourne&mdash;this
is what passes for wit among programmers).
Bash is the default shell on most modern implementations of Unix,
and in most packages that provide Unix-like tools for Windows.

Using Bash or any other shell
sometimes feels more like programming than like using a mouse.
Commands are terse (often only a couple of characters long),
their names are frequently cryptic,
and their output is lines of text rather than something visual like a graph.
On the other hand,
the shell allows us to combine existing tools in powerful ways with only a few keystrokes
and to set up pipelines to handle large volumes of data automatically.
In addition,
the command line is often the easiest way to interact with remote machines.
As clusters and cloud computing become more popular for scientific data crunching,
being able to drive them is becoming a necessary skill.

<div class="keypoints" markdown="1">

#### Key Points
* A shell is a program whose primary purpose is to read commands and run other programs.
* The shell's main advantages are its high action-to-keystroke ratio,
its support for automating repetitive tasks,
and that it can be used to access networked machines.
* The shell's main disadvantages are its primarily textual nature
and how cryptic its commands and operation can be.

</div>
Loading
Oops, something went wrong.

0 comments on commit 9aa1327

Please sign in to comment.