Documentation outline and placeholders about the source/design.

More content to be filled out in follow-ups.

PiperOrigin-RevId: 334826379
hzeller · Oct 2, 2020 · 34f17eb
1 parent 28bd812
commit 34f17eb

Showing 3 changed files with 56 additions and 6 deletions.
diff --git a/common/README.md b/common/README.md
@@ -1,10 +1,7 @@
 # Verible's Language-Agnostic Core Library
 
-Despite this package living under `//third_party/verible/common`, the classes
-and functions defined herein are language-agnostic and have nothing to do with
-Verilog.
-
-TODO(b/140521618): Migrate out of `//third_party/verible`.
+The classes and functions defined herein are language-agnostic and have nothing
+to do with specific languages like Verilog.
 
 ## Subdirectory Summary
 

diff --git a/doc/development.md b/doc/development.md
@@ -1,4 +1,4 @@
-# Development resources
+# Development Resources
 
 This document will help getting started [contributing](../CONTRIBUTING.md) to
 Verible. Collecting development aids and design concepts.
@@ -9,3 +9,18 @@ https://cs.opensource.google/verible/verible
 
 To learn more about how to use Kythe to
 [index the source code yourself, read here](./indexing.md).
+
+## Code Organization
+
+*   common/ contains all language-agnostic library code
+*   verilog/ contains Verilog-specific libraries and tools
+
+## Verilog Front-End
+
+*   [Lexer and Parser](./parser_design.md)
+
+## Analyzers
+
+## Transformers
+
+## Formatting
diff --git a/doc/parser_design.md b/doc/parser_design.md
@@ -0,0 +1,38 @@
+# Verible SystemVerilog Parser Design
+
+Verible uses traditional tools like Flex and Yacc/Bison, but *not* in the usual
+manner where the generated `yyparse()` function calls `yylex()` directly.
+Instead, the generated lexer and parser are completely decoupled.
+
+## Lexer
+
+The lexer is generated using Flex. A single lexer handles SystemVerilog
+language-proper lexical tokens together with preprocessing directives, macros,
+and other directives. The Flex-generated code is then wrapped in an interface
+that returns tokens one by one, ending with a special EOF token.
+
+## Contextualizer
+
+The contextualizer is a pass over the token stream produced by the lexer that
+can help disambiguate tokens with multiple interpretations. This, in turn, helps
+simplify the grammar in the parser implementation in ways that would not be
+possible with a context-free LR parser alone.
+
+## Preprocessor
+
+There is no standalone preprocessor yet.
+
+## Filter
+
+Token streams can be filtered into token stream views in various ways. One of
+the most useful filters hides comments and attributes so that the resulting view
+can be parsed.
+
+## Parser
+
+The parser is generated by Bison, using the LALR(1) algorithm. The grammar
+implemented is unconventional in that it explicitly handles preprocessing
+constructs, directives, and macros, which allows it to operate on a
+limited-but-useful subset of *unpreprocessed* code. The parser constructs a
+concrete syntax tree (CST) that captures all non-comment tokens, including
+punctuation and syntactic sugar.
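
A note on the decoupling described in `doc/parser_design.md`: the following is a minimal Python sketch of the idea, not Verible's actual C++ API. Every name in it (`Token`, `lex`, `hide_comments`, `pull_all`) and the toy token rules are hypothetical, chosen only for illustration. It shows a lexer that yields tokens one by one until a special EOF token, a filter that produces a comment-free view, and a consumer that is handed a token stream instead of calling the lexer itself. For brevity the filter hides only comments, not attributes.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Token:
    kind: str  # hypothetical token kinds, not Verible's token enum
    text: str


# Toy token rules (assumption); a real Flex specification is far larger.
_TOKEN_RE = re.compile(
    r"(?P<COMMENT>//[^\n]*)"
    r"|(?P<DIRECTIVE>`\w+)"
    r"|(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<NUMBER>\d+)"
    r"|(?P<PUNCT>[^\s\w])"
    r"|(?P<SPACE>\s+)"
)


def lex(source: str) -> Iterator[Token]:
    """Yield tokens one by one, ending with a special EOF token."""
    for match in _TOKEN_RE.finditer(source):
        if match.lastgroup != "SPACE":
            yield Token(match.lastgroup, match.group())
    yield Token("EOF", "")


def hide_comments(tokens: Iterable[Token]) -> Iterator[Token]:
    """Filtered view of a token stream with comments removed."""
    return (tok for tok in tokens if tok.kind != "COMMENT")


def pull_all(tokens: Iterable[Token]) -> List[Token]:
    """Stand-in for a parser: it consumes whatever stream it is handed."""
    consumed = []
    for tok in tokens:
        consumed.append(tok)
        if tok.kind == "EOF":
            break
    return consumed


source = "`define W 8 // width\nwire w;"
all_tokens = list(lex(source))                        # full stream, comments kept
parser_view = pull_all(hide_comments(lex(source)))    # what a parser would see
```

Because the consumer only sees an iterable of tokens, additional passes such as a contextualizer could be slotted between `lex` and `pull_all` without changing either side. Note that the directive token stays in the filtered view, in line with a grammar that handles preprocessing constructs explicitly.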