diff --git a/keps/sig-scheduling/20180409-scheduling-framework-extensions.png b/keps/sig-scheduling/20180409-scheduling-framework-extensions.png index 25f50471010..e2c1a2f841e 100644 Binary files a/keps/sig-scheduling/20180409-scheduling-framework-extensions.png and b/keps/sig-scheduling/20180409-scheduling-framework-extensions.png differ diff --git a/keps/sig-scheduling/20180409-scheduling-framework-threads.png b/keps/sig-scheduling/20180409-scheduling-framework-threads.png index ae9e1965d6d..34c2bde759c 100644 Binary files a/keps/sig-scheduling/20180409-scheduling-framework-threads.png and b/keps/sig-scheduling/20180409-scheduling-framework-threads.png differ diff --git a/keps/sig-scheduling/20180409-scheduling-framework.md b/keps/sig-scheduling/20180409-scheduling-framework.md index 5caf7ff9d9d..ff69f209f35 100644 --- a/keps/sig-scheduling/20180409-scheduling-framework.md +++ b/keps/sig-scheduling/20180409-scheduling-framework.md @@ -1,7 +1,9 @@ --- +kep-number: 34 title: Scheduling Framework authors: - - "@bsalamat" + - '@bsalamat' + - '@misterikkit' owning-sig: sig-scheduling participating-sigs: [] reviewers: @@ -10,443 +12,648 @@ approvers: - TBD editor: TBD creation-date: 2018-04-09 -last-updated: 2018-08-15 +last-updated: 2019-01-29 status: draft see-also: [] replaces: - - https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/scheduling-framework.md + - >- + https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/scheduling-framework.md superseded-by: [] --- - # Scheduling Framework - - -- [SUMMARY ](#summary-) -- [OBJECTIVE](#objective) - - [Terminology](#terminology) -- [BACKGROUND](#background) -- [OVERVIEW](#overview) - - [Non-goals](#non-goals) -- [DETAILED DESIGN](#detailed-design) - - [Bare bones of scheduling](#bare-bones-of-scheduling) - - [Communication and statefulness of plugins](#communication-and-statefulness-of-plugins) - - [Plugin registration](#plugin-registration) 
- - [Extension points](#extension-points) - - [Scheduling queue sort](#scheduling-queue-sort) - - [Pre-filter](#pre-filter) - - [Filter](#filter) - - [Post-filter](#post-filter) - - [Scoring](#scoring) - - [Post-scoring/pre-reservation](#post-scoringpre-reservation) - - [Reserve](#reserve) - - [Permit](#permit) - - [Approving a Pod binding](#approving-a-pod-binding) - - [Reject](#reject) - - [Pre-Bind](#pre-bind) - - [Bind](#bind) - - [Post Bind](#post-bind) -- [USE-CASES](#use-cases) - - [Dynamic binding of cluster-level resources](#dynamic-binding-of-cluster-level-resources) - - [Gang Scheduling](#gang-scheduling) -- [OUT OF PROCESS PLUGINS](#out-of-process-plugins) -- [CONFIGURING THE SCHEDULING FRAMEWORK](#configuring-the-scheduling-framework) -- [BACKWARD COMPATIBILITY WITH SCHEDULER v1](#backward-compatibility-with-scheduler-v1) -- [DEVELOPMENT PLAN](#development-plan) -- [TESTING PLAN](#testing-plan) -- [WORK ESTIMATES ](#work-estimates) - -# SUMMARY + + +* [SUMMARY](#summary) +* [MOTIVATION](#motivation) + * [Goals](#goals) + * [Non-Goals](#non-goals) +* [PROPOSAL](#proposal) + * [Scheduling Cycle](#scheduling-cycle) + * [Extension points](#extension-points) + * [Queue sort](#queue-sort) + * [Pre-filter](#pre-filter) + * [Filter](#filter) + * [Post-filter](#post-filter) + * [Scoring](#scoring) + * [Normalize scoring](#normalize-scoring) + * [Reserve](#reserve) + * [Permit](#permit) + * [Pre-bind](#pre-bind) + * [Bind](#bind) + * [Post-bind](#post-bind) + * [Un-reserve](#un-reserve) + * [Plugin API](#plugin-api) + * [PluginContext](#plugincontext) + * [PluginHandle](#pluginhandle) + * [Plugin Registration](#plugin-registration) + * [Plugin Lifecycle](#plugin-lifecycle) + * [Initialization](#initialization) + * [Concurrency](#concurrency) + * [Configuring Plugins](#configuring-plugins) + * [Enable/Disable](#enabledisable) + * [Change Evaluation Order](#change-evaluation-order) + * [Optional Args](#optional-args) + * [Backward 
compatibility](#backward-compatibility) + * [Interactions with Cluster Autoscaler](#interactions-with-cluster-autoscaler) +* [USE CASES](#use-cases) + * [Coscheduling](#coscheduling) + * [Dynamic Resource Binding](#dynamic-resource-binding) + * [Custom Scheduler Plugins (out of tree)](#custom-scheduler-plugins-out-of-tree) +* [GRADUATION CRITERIA](#graduation-criteria) +* [IMPLEMENTATION HISTORY](#implementation-history) + + + +# SUMMARY This document describes the Kubernetes Scheduling Framework. The scheduling -framework implements only basic functionality, but exposes many extension points -for plugins to expand its functionality. The plan is that this framework (with -its plugins) will eventually replace the current Kubernetes scheduler. - -# OBJECTIVE - -- make scheduler more extendable. -- Make scheduler core simpler by moving some of its features to plugins. -- Propose extension points in the framework. -- Propose a mechanism to receive plugin results and continue or abort based - on the received results. -- Propose a mechanism to handle errors and communicate it with plugins. - -## Terminology - -Scheduler v1, current scheduler: refer to existing scheduler of Kubernetes. -Scheduler v2, scheduling framework: refer to the new scheduler proposed in this -doc. - -# BACKGROUND - -Many features are being added to the Kubernetes default scheduler. They keep -making the code larger and logic more complex. A more complex scheduler is -harder to maintain, its bugs are harder to find and fix, and those users running -a custom scheduler have a hard time catching up and integrating new changes. -The current Kubernetes scheduler provides -[webhooks to extend](./scheduler_extender.md) -its functionality. However, these are limited in a few ways: - -1. The number of extension points are limited: "Filter" extenders are called - after default predicate functions. "Prioritize" extenders are called after - default priority functions. 
"Preempt" extenders are called after running - default preemption mechanism. "Bind" verb of the extenders are used to bind - a Pod. Only one of the extenders can be a binding extender, and that - extender performs binding instead of the scheduler. Extenders cannot be - invoked at other points, for example, they cannot be called before running - predicate functions. -1. Every call to the extenders involves marshaling and unmarshalling JSON. - Calling a webhook (HTTP request) is also slower than calling native functions. -1. It is hard to inform an extender that scheduler has aborted scheduling of - a Pod. For example, if an extender provisions a cluster resource and - scheduler contacts the extender and asks it to provision an instance of the - resource for the Pod being scheduled and then scheduler faces errors - scheduling the Pod and decides to abort the scheduling, it will be hard to - communicate the error with the extender and ask it to undo the provisioning - of the resource. -1. Since current extenders run as a separate process, they cannot use - scheduler's cache. They must either build their own cache from the API - server or process only the information they receive from the default scheduler. +framework is a new set of "plugin" APIs being added to the existing Kubernetes +Scheduler. Plugins are compiled into the scheduler, and these APIs allow many +scheduling features to be implemented as plugins, while keeping the scheduling +"core" simple and maintainable. + +*Note: Previous versions of this document proposed replacing the existing +scheduler with a new implementation.* + +# MOTIVATION + +Many features are being added to the Kubernetes Scheduler. They keep making the +code larger and the logic more complex. A more complex scheduler is harder to +maintain, its bugs are harder to find and fix, and those users running a custom +scheduler have a hard time catching up and integrating new changes. 
The current
+Kubernetes scheduler provides [webhooks to extend][] its functionality. However,
+these are limited in a few ways:
+
+[webhooks to extend]: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/scheduler_extender.md
+
+1. The number of extension points is limited: "Filter" extenders are called
+   after default predicate functions. "Prioritize" extenders are called after
+   default priority functions. "Preempt" extenders are called after running the
+   default preemption mechanism. The "Bind" verb of the extenders is used to
+   bind a Pod. Only one of the extenders can be a binding extender, and that
+   extender performs binding instead of the scheduler. Extenders cannot be
+   invoked at other points; for example, they cannot be called before running
+   predicate functions.
+1. Every call to the extenders involves marshalling and unmarshalling JSON.
+   Calling a webhook (HTTP request) is also slower than calling native
+   functions.
+1. It is hard to inform an extender that the scheduler has aborted scheduling
+   of a Pod. For example, suppose an extender provisions a cluster resource and
+   the scheduler asks it to provision an instance of that resource for the pod
+   being scheduled; if the scheduler then faces errors and decides to abort the
+   scheduling, it is hard to communicate the error to the extender and ask it
+   to undo the provisioning of the resource.
+1. Since current extenders run as a separate process, they cannot use the
+   scheduler's cache. They must either build their own cache from the API
+   server or process only the information they receive from the default
+   scheduler. The above limitations hinder building high performance and versatile scheduler
We would ideally like to have an extension mechanism that is fast -enough to allow keeping a bare minimum logic in the scheduler core and convert -many of the existing features of default scheduler, such as predicate and -priority functions and preemption into plugins. Such plugins will be compiled -with the scheduler. We would also like to provide an extension mechanism that do -not need recompilation of scheduler. The expected performance of such plugins is -lower than in-process plugins. Such out-of-process plugins should be used in -cases where quick invocation of the plugin is not a constraint. - -# OVERVIEW - -Scheduler v2 allows both built-in and out-of-process extenders. This new -architecture is a scheduling framework that exposes several extension points -during a scheduling cycle. Scheduler plugins can register to run at one or more -extension points. - -#### Non-goals - -- We will keep Kubernetes API backward compatibility, but keeping scheduler - v1 backward compatibility is a non-goal. Particularly, scheduling policy - config and v1 extenders won't work in this new framework. -- Solve all the scheduler v1 limitations, although we would like to ensure - that the new framework allows us to address known limitations in the future. -- Provide implementation details of plugins and call-back functions, such as - all of their arguments and return values. - -# DETAILED DESIGN - -## Bare bones of scheduling - -Pods that are not assigned to any node go to a scheduling queue and sorted by -order specified by plugins (described [here](#scheduling-queue-sort)). The -scheduling framework picks the head of the queue and starts a **scheduling -cycle** to schedule the pod. At the end of the cycle scheduler determines -whether the pod is schedulable or not. If the pod is not schedulable, its status -is updated and goes back to the scheduling queue. If the pod is schedulable (one -or more nodes are found that can run the Pod), the scoring process is started. 
-The scoring process finds the best node to run the Pod. Once the best node is -picked, the scheduler updates its cache and then a bind go routine is started to -bind the pod. -The above process is the same as what Kubernetes scheduler v1 does. Some of the -essential features of scheduler v1, such as leader election, will also be -transferred to the scheduling framework. -In the rest of this section we describe how various plugins are used to enrich -this basic workflow. This document focuses on in-process plugins. -Out-of-process plugins are discussed later in a separate doc. - -## Communication and statefulness of plugins - -The scheduling framework provides a library that plugins can use to pass -information to other plugins. This library keeps a map from keys of type string -to opaque pointers of type interface{}. A write operation takes a key and a -pointer and stores the opaque pointer in the map with the given key. Other -plugins can provide the key and receive the opaque pointer. Multiple plugins can -share the state or communicate via this mechanism. -The saved state is preserved only during a single scheduling cycle. At the end -of a scheduling cycle, this map is destructed. So, plugins cannot keep shared -state across multiple scheduling cycle. They can, however, update the scheduler -cache via the provided interface of the cache. The cache interface allows -limited state preservation across multiple scheduling cycle. -It is worth noting that plugins are assumed to be **trusted**. Scheduler does -not prevent one plugin from accessing or modifying another plugin's state. - -## Plugin registration - -Plugin registration is done by providing an extension point and a function that -should be called at that extension point. This step will be something like: +features. We would ideally like to have an extension mechanism that is fast +enough to allow existing features to be converted into plugins, such as +predicate and priority functions. 
Such plugins will be compiled into the +scheduler binary. Additionally, authors of custom schedulers can compile a +custom scheduler using (unmodified) scheduler code and their own plugins. -```go -register("pre-filter", plugin.foo) -``` +## Goals + +- Make scheduler more extendable. +- Make scheduler core simpler by moving some of its features to plugins. +- Propose extension points in the framework. +- Propose a mechanism to receive plugin results and continue or abort based on + the received results. +- Propose a mechanism to handle errors and communicate them with plugins. + +## Non-Goals + +- Solve all scheduler limitations, although we would like to ensure that the + new framework allows us to address known limitations in the future. +- Provide implementation details of plugins and call-back functions, such as + all of their arguments and return values. + +# PROPOSAL + +The Scheduling Framework defines new extension points and Go APIs in the +Kubernetes Scheduler for use by "plugins". Plugins add scheduling behaviors to +the scheduler, and are included at compile time. The scheduler's ComponentConfig +will allow plugins to be enabled, disabled, and reordered. Custom schedulers can +write their plugins "[out-of-tree](#custom-scheduler-plugins-out-of-tree)" and +compile a scheduler binary with their own plugins included. -The details of the function signature will be provided later. +## Scheduling Cycle + +The main loop of the scheduler is referred to as a "scheduling cycle". Each +cycle covers the complete process of assigning one pod to a node (or determining +that the pod cannot be scheduled). Multiple scheduling cycles are started +serially, but some parts may run concurrently. (See [Concurrency](#concurrency)) ## Extension points -The following picture shows the scheduling cycle of a Pod and the extension +The following picture shows the scheduling cycle of a pod and the extension points that the scheduling framework exposes. 
In this picture "Filter" is -equivalent to "Predicate" in scheduler v1 and "Scoring" is equivalent to -"Priority function". Plugins are go functions. They are registered to be called -at one of these extension points. They are called by the framework in the same -order they are registered for each extension point. -In the following sections we describe each extension point in the same order -they are called in a schedule cycle. +equivalent to "Predicate" and "Scoring" is equivalent to "Priority function". +Plugins are registered to be called at one or more of these extension points. In +the following sections we describe each extension point in the same order they +are called in a scheduling cycle. + +One plugin may register at multiple extension points to perform more complex or +stateful tasks. ![image](20180409-scheduling-framework-extensions.png) -### Scheduling queue sort +### Queue sort -These plugins indicate how Pods should be sorted in the scheduling queue. A -plugin registered at this point only returns greater, smaller, or equal to -indicate an ordering between two Pods. In other words, a plugin at this -extension point returns the answer to "less(pod1, pod2)". Multiple plugins may -be registered at this point. Plugins registered at this point are called in -order and the invocation continues as long as plugins return "equal". Once a -plugin returns "greater" or "smaller" the invocation of these plugins are -stopped. +These plugins are used to sort pods in the scheduling queue. A queue sort plugin +essentially will provide a "less(pod1, pod2)" function. Only one queue sort +plugin may be enabled at a time. ### Pre-filter -These plugins are generally useful to check certain conditions that the cluster -or the Pod must meet. These are also useful to perform pre-processing on the pod -and store some information about the pod that can be used by other plugins. -The pod pointer is passed as an argument to these plugins. 
If any of these
-plugins return an error, the scheduling cycle is aborted.
-These plugins are called serially in the same order registered.
+These plugins are used to pre-process info about the pod, or to check certain
+conditions that the cluster or the pod must meet. If a pre-filter plugin returns
+an error, the scheduling cycle is aborted. Pre-filter plugins are called
+serially within a scheduling cycle.

### Filter

-Filter plugins filter out nodes that cannot run the Pod. Scheduler runs these
-plugins per node in the same order that they are registered, but scheduler may
-run these filter function for multiple nodes in parallel. So, these plugins must
-use synchronization when they modify state.
-Scheduler stops running the remaining filter functions for a node once one of
-these filters fails for the node.
+These plugins are used to filter out nodes that cannot run the Pod. For each
+node, the scheduler will call filter plugins in their configured order. If any
+filter plugin marks the node as infeasible, the remaining plugins will not be
+called for that node. Nodes may be evaluated concurrently.

### Post-filter

-The Pod and the set of nodes that can run the Pod are passed to these plugins.
-They are called whether Pod is schedulable or not (whether the set of nodes is
-empty or non-empty).
-If any of these plugins return an error or if the Pod is determined
-unschedulable, the scheduling cycle is aborted.
-These plugins are called serially.
+This is an informational extension point. Plugins will be called with a list
+of nodes that passed the filtering phase. A plugin may use this data to update
+internal state or to generate logs/metrics.
+
+**Note:** Plugins wishing to perform "pre-scoring" work should use the
+post-filter extension point.

### Scoring

-These plugins are similar to priority function in scheduler v1. They are
-utilized to rank nodes that have passed the filtering stage.
Similar to Filter -plugins, these are called per node serially in the same order registered, but -scheduler may run them for multiple nodes in parallel. -Each one of these functions return a score for the given node. The score is -multiplied by the weight of the function and aggregated with the result of other -scoring functions to yield a total score for the node. -These functions can never block scheduling. In case of an error they should -return zero for the Node being ranked. +These plugins are used to rank nodes that have passed the filtering phase. The +scheduler will call each scoring plugin for each node. There will be a well +defined range of integers representing the minimum and maximum scores. After the +[normalize scoring](#normalize-scoring) phase, the scheduler will combine node +scores from all plugins according to the configured plugin weights. -### Post-scoring/pre-reservation +If a scoring plugin returns an error, the scheduler will treat it as a zero +score. -After all scoring plugins are invoked and the score of nodes are determined, the -framework picks the best node with the highest score and then it calls -post-scoring plugins. The Pod and the chosen Node are passed to these plugins. -These plugins have one more chance to check any conditions about the assignment -of the Pod to this Node and reject the node if needed. +### Normalize scoring -![image](20180409-scheduling-framework-threads.png) +These plugins are used to modify scores before the scheduler computes a final +ranking of Nodes. A plugin that registers for this extension point will be +called with the [scoring](#scoring) results from the same plugin. This is called +once per plugin per scheduling cycle. + +For example, suppose a plugin `BlinkingLightScorer` ranks Nodes based on how +many blinking lights they have. 
+
+```go
+func ScoreNode(_ *v1.Pod, n *v1.Node) (int, error) {
+	return getBlinkingLightCount(n)
+}
+```
+
+However, the maximum count of blinking lights may be small compared to
+`NodeScoreMax`. To fix this, `BlinkingLightScorer` should also register for this
+extension point.
+
+```go
+func NormalizeScores(scores map[string]int) {
+	highest := 0
+	for _, score := range scores {
+		highest = max(highest, score)
+	}
+	if highest == 0 {
+		// All nodes scored zero; leave the scores as-is to avoid
+		// dividing by zero below.
+		return
+	}
+	for node, score := range scores {
+		scores[node] = score * NodeScoreMax / highest
+	}
+}
+```
+
+If any normalize-scoring plugin returns an error, the scheduling cycle is
+aborted.
+
+**Note:** Plugins wishing to perform "pre-reserve" work should use the
+normalize-scoring extension point.

### Reserve

-At this point scheduler updates its cache by "reserving" a Node (partially or
-fully) for the Pod. In scheduler v1 this stage is called "assume".
-At this point, only the scheduler cache is updated to
-reflect that the Node is (partially) reserved for the Pod. The scheduling
-framework calls plugins registered at this extension points so that they get a
-chance to perform cache updates or other accounting activities. These plugins
-do not return any value (except errors).
+This is an informational extension point. Plugins which maintain runtime state
+(aka "stateful plugins") should use this extension point to be notified by the
+scheduler when resources on a node are being reserved for a given Pod. This
+happens before the scheduler actually binds the pod to the Node, and it exists
+to prevent race conditions while the scheduler waits for the bind to succeed.
+
+Once a pod is in the reserved state, it will either trigger
+[Un-reserve](#un-reserve) plugins (on failure) or [Post-bind](#post-bind)
+plugins (on success).

-The actual assignment of the Node to the Pod happens during the "Bind" phase.
-That is when the API server updates the Pod object with the Node information.
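To illustrate the reserve accounting a stateful plugin might keep, here is a minimal, self-contained sketch. The types and method names (`pod`, `Reserve`, `Unreserve`) are stand-ins for illustration only; the real hooks would receive `*v1.Pod` and framework-provided state, not these toy structs.

```go
package main

import (
	"fmt"
	"sync"
)

// pod is a toy stand-in for *v1.Pod with only the fields this sketch needs.
type pod struct {
	name     string
	milliCPU int64
}

// reservePlugin mirrors the scheduler's reserve notifications in its own
// accounting. A mutex guards the map because scheduling and binding work
// can overlap in time.
type reservePlugin struct {
	mu       sync.Mutex
	reserved map[string]int64 // node name -> milliCPU held for in-flight pods
}

func newReservePlugin() *reservePlugin {
	return &reservePlugin{reserved: make(map[string]int64)}
}

// Reserve records that a node has been tentatively assigned to the pod.
func (r *reservePlugin) Reserve(p *pod, node string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.reserved[node] += p.milliCPU
	return nil
}

// Unreserve undoes Reserve when the pod is later rejected.
func (r *reservePlugin) Unreserve(p *pod, node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.reserved[node] -= p.milliCPU
}

func main() {
	r := newReservePlugin()
	p := &pod{name: "web-0", milliCPU: 500}
	_ = r.Reserve(p, "node-a")
	fmt.Println(r.reserved["node-a"]) // 500
	r.Unreserve(p, "node-a")
	fmt.Println(r.reserved["node-a"]) // 0
}
```

The point of the sketch is the pairing: whatever state Reserve adds, a later rejection must remove, which is why plugins registering here usually also handle the failure path.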
+*Note: This concept used to be referred to as "assume".* ### Permit -Permit plugins run in a separate go routine (in parallel). Each plugin can return -one of the three possible values: 1) "permit", 2) "deny", or 3) "wait". If all -plugins registered at this extension point return "permit", the pod is sent to -the next step for binding. If any of the plugins returns "deny", the pod is -rejected and sent back to the scheduling queue. If any of the plugins returns -"wait", the Pod is kept in reserved state until it is explicitly approved for -binding. A plugin that returns "wait" must return a "timeout" as well. If the -timeout expires, the pod is rejected and goes back to the scheduling queue. +These plugins are used to prevent or delay the binding of a Pod. A permit plugin +can do one of three things. -#### Approving a Pod binding +1. **approve** \ + Once all permit plugins approve a pod, it is sent for binding. -While any plugin can receive the list of reserved Pod from the cache and approve -them, we expect only the "Permit" plugins to approve binding of reserved Pods -that are in "waiting" state. Once a Pod is approved, it is sent to the Bind -stage. +1. **deny** \ + If any permit plugin denies a pod, it is returned to the scheduling queue. + This will trigger [Un-reserve](#un-reserve) plugins. -### Reject +1. **wait** (with a timeout) \ + If a permit plugin returns "wait", then the pod is kept in the permit phase + until a [plugin approves it](#pluginhandle). If a timeout occurs, **wait** + becomes **deny** and the pod is returned to the scheduling queue, triggering + [un-reserve](#un-reserve) plugins. -Plugins called at "Permit" may perform some operations that should be undone if -the Pod reservation fails. The "Reject" extension point allows such clean-up -operations to happen. Plugins registered at this point are called if the -reservation of the Pod is cancelled. 
The reservation is cancelled if any of the
-"Permit" plugins returns "reject" or if a Pod reservation, which is in "wait"
-state, times out.
+**Approving a pod binding**

-### Pre-Bind

+While any plugin can receive the list of reserved pods from the cache and
+approve them (see [`PluginHandle`](#pluginhandle)), we expect only the permit
+plugins to approve binding of reserved Pods that are in "waiting" state. Once a
+pod is approved, it is sent to the pre-bind phase.

-When a Pod is approved for binding it reaches to this stage. These plugins run
-before the actual binding of the Pod to a Node happens. The binding starts only
-if all of these plugins return true. If any returns false, the Pod is rejected
-and sent back to the scheduling queue. These plugins run in a separate go
-routine. The same go routine runs "Bind" after these plugins when all of them
-return true.

+### Pre-bind
+
+These plugins are used to perform any work required before a pod is bound. For
+example, a pre-bind plugin may provision a network volume and mount it on the
+target node before allowing the pod to run there.
+
+If any pre-bind plugin returns an error, the pod is [rejected](#un-reserve) and
+returned to the scheduling queue.

### Bind

-Once all pre-bind plugins return true, the Bind plugins are executed. Multiple
-plugins may be registered at this extension point. Each plugin may return true
-or false (or an error). If a plugin returns false, the next plugin will be
- -### Informer Events - -The scheduling framework, similar to Scheduler v1, will have informers that let -the framework keep its copy of the state of the cluster up-to-date. The -informers generate events, such as "PodAdd", "PodUpdate", "PodDelete", etc. The -framework allows plugins to register their own handlers for any of these events. -The handlers allow plugins with internal state or caches to keep their state -updated. - -# USE-CASES - -In this section we provide a couple of examples on how the scheduling framework -can be used to solve common scheduling scenarios. - -### Dynamic binding of cluster-level resources - -Cluster level resources are resources which are not immediately available on -nodes at the time of scheduling Pods. Scheduler needs to ensure that such -cluster level resources are bound to a chosen Node before it can schedule a Pod -that requires such resources to the Node. We refer to this type of binding of -resources to Nodes at the time of scheduling Pods as dynamic resource binding. -Dynamic resource binding has proven to be a challenge in Scheduler v1, because -Scheduler v1 is not flexible enough to support various types of plugins at -different phases of scheduling. As a result, binding of storage volumes is -integrated in the scheduler code and some non-trivial changes are done to the -scheduler extender to support dynamic binding of network GPUs. -The scheduling framework allows such dynamic bindings in a cleaner way. The main -thread of scheduling framework process a pending Pod that requests a network -resource and finds a node for the Pod and reserves the Pod. A dynamic resource -binder plugin installed at "Pre-Bind" stage is invoked (in a separate thread). -It analyzes the Pod and when detects that the Pod needs dynamic binding of the -resource, the plugin tries to attach the cluster resource to the chosen node and -then returns true so that the Pod can be bound. 
If the resource attachment -fails, it returns false and the Pod will be retried. -When there are multiple of such network resources, each one of them installs one -"pre-bind" plugin. Each plugin looks at the Pod and if the Pod is not requesting -the resource that they are interested in, they simply return "true" for the -pod. - -### Gang Scheduling - -Gang scheduling allows a certain number of Pods to be scheduled simultaneously. -If all the members of the gang cannot be scheduled at the same time, none of -them should be scheduled. Gang scheduling may have various other features as -well, but in this context we are interested in simultaneous scheduling of Pods. -Gang scheduling in the scheduling framework can be done with an "Permit" plugin. -The main scheduling thread processes pods one by one and reserves nodes for -them. The gang scheduling plugin at the Permit stage is invoked for each pod. -When it finds that the pod belongs to a gang, it checks the properties of the -gang. If there are not enough members of the gang which are scheduled or in -"wait" state, the plugin returns "wait". When the number reaches the desired -value, all the Pods in wait state are approved and sent for binding. - -# OUT OF PROCESS PLUGINS - -Out of process plugins (OOPP) are called via JSON over an HTTP interface. In -other words, the scheduler will support webhooks at most (maybe all) of the -extension points. Data sent to an OOPP must be marshalled to JSON and data -received must be unmarshalled. So, calling an OOPP is significantly slower than -in-process plugins. -We do not plan to build OOPPs in the first version of the scheduling framework. -So, more details on them is to be determined. - - -# DEVELOPMENT PLAN - -Earlier, we wanted to develop the scheduling framework as an independent project -from scheduler V1. However, that would need much engineering resources. 
-It would also be more difficult to roll out a new and not fully-backward -compatible scheduler in Kubernetes where tens of thousands of users depend on -the behavior of the scheduler. -After revisiting the ideas and challenges, we changed our plan and have decided -to build some of the ideas of the scheduling framework into Scheduler V1 to make -it more extendable. - -As the first step, we would like to build: - 1. [Pre-bind](#pre-bind) and [Reserve](#reserve) plugin points. These will - help us move our existing cluster resource binding code, such as persistent - volume binding, to plugins. - 1. We will also build - [the plugin communication mechanism](#communication-and-statefulness-of-plugins). - This will allow us to build more sophisticated plugins that would require - communication and also help us clean up existing scheduler's code by removing - existing transient cache data. - -More features of the framework can be added to the Scheduler in the future based -on the requirements. - - -# CONFIGURING THE SCHEDULING FRAMEWORK - -TBD - -# BACKWARD COMPATIBILITY WITH SCHEDULER v1 - -We will build a new set of plugins for scheduler v2 to ensure that the existing -behavior of scheduler v1 in placing Pods on nodes is preserved. This includes -building plugins that replicate default predicate and priority functions of -scheduler v1 and its binding mechanism, but scheduler extenders built for -scheduler v1 won't be compatible with scheduler v2. Also, predicate and priority -functions which are not enabled by default (such as service affinity) are not -guaranteed to exist in scheduler v2. - -# DEVELOPMENT PLAN - -We will develop the scheduling framework as an incubator project in SIG -scheduling. It will be built in a separate code-base independently from -scheduler v1, but we will probably use a lot of code from scheduler v1. - -# TESTING PLAN - -We will add unit-tests as we build functionalities of the scheduling framework. 
-The scheduling framework should eventually be able to pass integration and e2e
-tests of scheduler v1, excluding those tests that involve scheduler extensions.
-The e2e and integration tests may need to be modified slightly as the
-initialization and configuration of the scheduling framework will be different
-than scheduler v1.
-
-# WORK ESTIMATES
-
-We expect to see an early version of the scheduling framework in two release
-cycles (end of 2018). If things go well, we will start offering it as an
-alternative to the scheduler v1 by the end of Q1 2019 and start the deprecation
-of scheduler v1. We will make it the default scheduler of Kubernetes in Q2 2019,
-but we will keep the option of using scheduler v1 for at least two more release
-cycles.
-
+These plugins are used to bind a pod to a Node. Bind plugins will not be called
+until all pre-bind plugins have completed. Each bind plugin is called in the
+configured order. A bind plugin may choose whether or not to handle the given
+Pod. If a bind plugin chooses to handle a Pod, **the remaining bind plugins are
+skipped**.
+
+### Post-bind
+
+This is an informational extension point. Post-bind plugins are called after a
+pod is successfully bound. This is the end of a scheduling cycle, and can be
+used to clean up associated resources.
+
+### Un-reserve
+
+This is an informational extension point. If a pod was reserved and then
+rejected in a later phase, un-reserve plugins will be notified. Un-reserve
+plugins should clean up state associated with the reserved Pod.
+
+Plugins that use this extension point should usually also use
+[Reserve](#reserve).
+
+## Plugin API
+
+There are two steps to the plugin API. First, plugins must register and get
+configured; then they use the extension point interfaces. Extension point
+interfaces have the following form.
+ +```go +type Plugin interface { + Name() string +} + +type QueueSortPlugin interface { + Plugin + Less(*v1.Pod, *v1.Pod) bool +} + +type PreFilterPlugin interface { + Plugin + PreFilter(PluginContext, *v1.Pod) error +} + +// ... +``` + +### PluginContext + +Most* plugin functions will be called with a `PluginContext` argument. A +`PluginContext` represents the current scheduling cycle. + +A `PluginContext` provides read-only APIs for accessing the scheduler's cache of +cluster state. This is the preferred way for plugins to iterate over nodes, +iterate over pods on one node, check available resources, and other tasks. The +scheduler will provide a consistent view of the cluster through these APIs, even +if the data is a little stale. Since two scheduling cycles can overlap in time, +plugins should not assume that they will see the same data from two different +`PluginContext`s. + +The `PluginContext` also provides an API similar to +[`context.WithValue`](https://godoc.org/context#WithValue) that can be used to +pass data between plugins at different extension points. Multiple plugins can +share the state or communicate via this mechanism. The state is preserved only +during a single scheduling cycle. It is worth noting that plugins are assumed to +be **trusted**. The scheduler does not prevent one plugin from accessing or +modifying another plugin's state. + +\* *The only exception is for [queue sort](#queue-sort) plugins.* + +**WARNING**: The data available through a `PluginContext` is not valid after a +scheduling cycle ends, and plugins should not hold references to that data +longer than necessary. + +### PluginHandle + +While the `PluginContext` provides APIs relevant to a single scheduling cycle, +the `PluginHandle` provides APIs relevant to the lifetime of a plugin. +Specifically, `PluginHandle` provides a client (`kubernetes.Interface`) and +`SharedInformerFactory`. The handle will also provide APIs to list and approve +or reject [waiting pods](#permit). 
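The `PluginContext` data-sharing mechanism described above can be sketched with a self-contained toy. Everything here (the `Write`/`Read` method names, the key type, and the example key) is an illustrative assumption, not the final scheduler API:

```go
package main

import (
	"fmt"
	"sync"
)

// contextKey is a toy stand-in for whatever key type the framework settles
// on; a named type helps avoid collisions between cooperating plugins.
type contextKey string

// PluginContext sketches the WithValue-style store shared by plugins during
// a single scheduling cycle. All names here are assumptions.
type PluginContext struct {
	mu   sync.RWMutex
	data map[contextKey]interface{}
}

func NewPluginContext() *PluginContext {
	return &PluginContext{data: make(map[contextKey]interface{})}
}

// Write stores a value for plugins at later extension points in this cycle.
func (c *PluginContext) Write(k contextKey, v interface{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[k] = v
}

// Read retrieves a value written earlier in the same cycle.
func (c *PluginContext) Read(k contextKey) (interface{}, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[k]
	return v, ok
}

const nodeCountKey contextKey = "example.com/filtered-node-count"

func main() {
	ctx := NewPluginContext()

	// A plugin at an early extension point (e.g. pre-filter) records state...
	ctx.Write(nodeCountKey, 12)

	// ...and a cooperating plugin at a later point (e.g. pre-bind) reads it.
	if v, ok := ctx.Read(nodeCountKey); ok {
		fmt.Println("nodes surviving filter:", v.(int))
	}
}
```

Because two scheduling cycles can overlap, a real plugin would receive a fresh `PluginContext` each cycle and, per the warning earlier in this section, must not hold references to its data after the cycle ends.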
+
+**WARNING**: `PluginHandle` provides access to both the Kubernetes API server
+and the scheduler's internal cache. The two are **not guaranteed to be in sync**
+and extreme care should be taken when writing a plugin that uses data from both
+of them.
+
+Providing plugins access to the API server is necessary to implement useful
+features, especially when those features consume object types that the scheduler
+does not normally consider. Providing a `SharedInformerFactory` allows plugins
+to share caches safely.
+
+### Plugin Registration
+
+Each plugin must define a constructor and add it to the hard-coded registry. For
+more information about constructor args, see [Optional Args](#optional-args).
+
+Example:
+
+```go
+type PluginFactory = func(json.RawMessage, PluginHandle) (Plugin, error)
+
+type Registry map[string]PluginFactory
+
+func NewRegistry() Registry {
+	return Registry{
+		fooplugin.Name: fooplugin.New,
+		barplugin.Name: barplugin.New,
+		// New plugins are registered here.
+	}
+}
+```
+
+It is also possible to add plugins to a `Registry` object and inject that into a
+scheduler. See [Custom Scheduler Plugins](#custom-scheduler-plugins-out-of-tree).
+
+## Plugin Lifecycle
+
+### Initialization
+
+There are two steps to plugin initialization. First,
+[plugins are registered](#plugin-registration). Second, the scheduler uses its
+configuration to decide which plugins to instantiate. If a plugin registers for
+multiple extension points, *it is instantiated only once*.
+
+When a plugin is instantiated, it is passed [config args](#optional-args) and a
+[`PluginHandle`](#pluginhandle).
+
+### Concurrency
+
+There are two types of concurrency that plugin writers should consider. A plugin
+might be invoked several times concurrently when evaluating multiple nodes, and
+a plugin may be called concurrently from *different
+[scheduling cycles](#scheduling-cycle)*.
+
+In the main thread of the scheduler, only one scheduling cycle is processed at a
+time. Any extension point up to and including [reserve](#reserve) will be
+finished before the next scheduling cycle begins*. After the reserve phase, the
+[permit](#permit) and [bind](#bind) phases are executed asynchronously. This
+means that a plugin could be called concurrently from two different scheduling
+cycles, provided that at least one of the calls is to an extension point after
+reserve. Stateful plugins should take care to handle these situations.
+
+Finally, [un-reserve](#un-reserve) plugins may be called from either the Permit
+thread or the Bind thread, depending on how the pod was rejected.
+
+\* *The queue sort extension point is a special case. It is not part of a
+scheduling cycle and may be called concurrently for many pod pairs.*
+
+![image](20180409-scheduling-framework-threads.png)
+
+## Configuring Plugins
+
+The scheduler's component configuration will allow for plugins to be enabled,
+disabled, or otherwise configured. Plugin configuration is separated into two
+parts.
+
+1. A list of enabled plugins for each extension point (and the order they
+   should run in). If one of these lists is omitted, the default list will be
+   used.
+1. An optional set of custom plugin arguments for each plugin. Omitting config
+   args for a plugin is equivalent to using the default config for that plugin.
+
+The plugin configuration is organized by extension points. A plugin that
+registers for multiple extension points must be included in each list.
+
+```go
+type KubeSchedulerConfiguration struct {
+	// ... other fields
+	Plugins      Plugins
+	PluginConfig []PluginConfig
+}
+
+type Plugins struct {
+	QueueSort      []Plugin
+	PreFilter      []Plugin
+	Filter         []Plugin
+	PostFilter     []Plugin
+	Score          []Plugin
+	NormalizeScore []Plugin
+	Reserve        []Plugin
+	Permit         []Plugin
+	PreBind        []Plugin
+	Bind           []Plugin
+	PostBind       []Plugin
+	UnReserve      []Plugin
+}
+
+type Plugin struct {
+	Name   string
+	Weight int // Only valid for Score plugins
+}
+
+type PluginConfig struct {
+	Name string
+	Args json.RawMessage
+}
+```
+
+Example:
+
+```json
+{
+  "plugins": {
+    "preFilter": [
+      {
+        "name": "PluginA"
+      },
+      {
+        "name": "PluginB"
+      },
+      {
+        "name": "PluginC"
+      }
+    ],
+    "score": [
+      {
+        "name": "PluginA",
+        "weight": 30
+      },
+      {
+        "name": "PluginX"
+      },
+      {
+        "name": "PluginY"
+      }
+    ]
+  },
+  "pluginConfig": [
+    {
+      "name": "PluginX",
+      "args": {
+        "favorite_color": "#326CE5",
+        "favorite_number": 7,
+        "thanks_to": "thockin"
+      }
+    }
+  ]
+}
+```
+
+### Enable/Disable
+
+When specified, the plugins listed for a particular extension point are the
+only ones enabled. If an extension point is omitted from the config, then the
+default set of plugins is used for that extension point.
+
+### Change Evaluation Order
+
+When relevant, plugin evaluation order is specified by the order the plugins
+appear in the configuration. A plugin that registers for multiple extension
+points can have different ordering at each extension point.
+
+### Optional Args
+
+Plugins may receive arguments from their config with arbitrary structure.
+Because one plugin may appear in multiple extension points, the config is in a
+separate list of `PluginConfig`.
+
+For example,
+
+```json
+{
+  "name": "ServiceAffinity",
+  "args": {
+    "LabelName": "app",
+    "LabelValue": "mysql"
+  }
+}
+```
+
+```go
+func NewServiceAffinity(args json.RawMessage, h PluginHandle) (Plugin, error) {
+	var config struct {
+		LabelName, LabelValue string
+	}
+	if err := json.Unmarshal(args, &config); err != nil {
+		return nil, errors.Wrap(err, "could not parse args")
+	}
+	//...
+}
+```
+
+### Backward compatibility
+
+The current `KubeSchedulerConfiguration` kind has `apiVersion:
+kubescheduler.config.k8s.io/v1alpha1`. This new config format will be either
+`v1alpha2` or `v1beta1`. When a newer version of the scheduler parses a
+`v1alpha1` config, the "policy" section will be used to construct an equivalent
+plugin configuration.
+
+*Note: Moving `KubeSchedulerConfiguration` to `v1` is outside the scope of this
+design, but see also
+https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/0032-create-a-k8s-io-component-repo.md
+and https://github.com/kubernetes/community/pull/3008*
+
+## Interactions with Cluster Autoscaler
+
+TODO
+
+# USE CASES
+
+These are just a few examples of how the scheduling framework can be used.
+
+## Coscheduling
+
+Functionality similar to
+[kube-batch](https://github.com/kubernetes-sigs/kube-batch) (sometimes called
+"gang scheduling") could be implemented as a plugin. For pods in a batch, the
+plugin would "accumulate" pods in the [permit](#permit) phase by using the
+"wait" option. Because the permit stage happens after [reserve](#reserve),
+subsequent pods will be scheduled as if the waiting pod is using those
+resources. Once enough pods from the batch are waiting, they can all be
+approved.
+
+## Dynamic Resource Binding
+
+[Topology-Aware Volume Provisioning](https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/)
+can be (re)implemented as a plugin that registers for [filter](#filter) and
+[pre-bind](#pre-bind) extension points. At the filtering phase, the plugin can
+ensure that the pod will be scheduled in a zone which is capable of provisioning
+the desired volume. Then at the pre-bind phase, the plugin can provision the
+volume before letting the scheduler bind the pod.
+
+## Custom Scheduler Plugins (out of tree)
+
+The scheduling framework allows people to write custom, performant scheduler
+features without forking the scheduler's code. To accomplish this, developers
+write their own `main()` wrapper around the scheduler. Because plugins must be
+compiled with the scheduler, this wrapper is needed to avoid modifying code in
+`vendor/k8s.io/kubernetes`.
+
+```go
+import (
+	"k8s.io/kubernetes/pkg/scheduler/plugins"
+	scheduler "k8s.io/kubernetes/cmd/kube-scheduler/app"
+)
+
+func main() {
+	registry := plugins.NewRegistry()
+	registry.Add("MyPlugin", NewMyPlugin)
+	scheduler.Main(registry)
+}
+```
+
+*Note: The above code is an example, and might not match the implemented API.*
+
+The custom plugin would be enabled in the scheduler config.
+
+```json
+{
+  "name": "MyPlugin"
+}
+```
+
+# GRADUATION CRITERIA
+
+TODO
+
+# IMPLEMENTATION HISTORY
+
+TODO: write down milestones and target releases, and a plan for how we will
+gracefully move to the new system