Skip to content

Instantly share code, notes, and snippets.

@nymous
Last active December 27, 2024 22:00
Show Gist options
  • Save nymous/f138c7f06062b7c43c060bf03759c29e to your computer and use it in GitHub Desktop.
Save nymous/f138c7f06062b7c43c060bf03759c29e to your computer and use it in GitHub Desktop.

Revisions

  1. nymous renamed this gist Jan 26, 2023. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  2. nymous revised this gist Jan 26, 2023. 2 changed files with 26 additions and 1 deletion.
    21 changes: 21 additions & 0 deletions LICENSE
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,21 @@
    MIT License

    Copyright (c) 2023 Thomas GAUDIN

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all
    copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    SOFTWARE.
    6 changes: 5 additions & 1 deletion README.md
    Original file line number Diff line number Diff line change
    @@ -36,4 +36,8 @@ The "duration" logged and returned in the header does not encompass the full dur
    If you install [Rich](https://rich.readthedocs.io/) or [`better-exceptions`](https://github.com/Qix-/better-exceptions), you will automagically get pretty formatted exception with context, variable values... Try it and see if you like it

    There are some TODOs left in the code, you can suggest improvements in the comments :smile:
    The Datadog middleware is especially horrible, there must be a better way to add it manually. I also need to open an issue in Starlette comparing in which order middlewares are activated in different languages and frameworks, to try and make it more consistent with other ecosystems.
    The Datadog middleware is especially horrible, there must be a better way to add it manually. I also need to open an issue in Starlette comparing in which order middlewares are activated in different languages and frameworks, to try and make it more consistent with other ecosystems.

    ## License

    This code is available under the [MIT license](#LICENSE).
  3. nymous revised this gist Jun 24, 2022. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -7,6 +7,7 @@ Then, you can use Structlog loggers or standard `logging` loggers, and they both
    Requests are assigned a correlation ID with the [`asgi-correlation-id`](https://github.com/snok/asgi-correlation-id) middleware (either captured from incoming request or generated on the fly).
    All logs are linked to the correlation ID, and to the Datadog trace/span if instrumented.
    This data "global to the request" is stored in context vars, and automatically added to all logs produced during the request thanks to Structlog.
    You can add to these "global local variables" at any point in an endpoint with `structlog.contextvars.bind_contextvars(customer_id=123)`. Any log sent after this call will contain this attribute, whether it comes from structlog or logging.

    The duration of the request is also captured, and all technical information about the request (URL, method, client IP/port, status code) are logged as attributes.

  4. nymous revised this gist Jun 24, 2022. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -2,6 +2,8 @@

    This logging setup configures [Structlog](https://www.structlog.org/) to output pretty logs in development, and JSON log lines in production.

    Then, you can use Structlog loggers or standard `logging` loggers, and they both will be processed by the Structlog pipeline (see the `hello()` endpoint for reference). That way any log generated by your dependencies will also be processed and enriched, even if they know nothing about Structlog!

    Requests are assigned a correlation ID with the [`asgi-correlation-id`](https://github.com/snok/asgi-correlation-id) middleware (either captured from incoming request or generated on the fly).
    All logs are linked to the correlation ID, and to the Datadog trace/span if instrumented.
    This data "global to the request" is stored in context vars, and automatically added to all logs produced during the request thanks to Structlog.
  5. nymous revised this gist Jun 24, 2022. 1 changed file with 9 additions and 0 deletions.
    9 changes: 9 additions & 0 deletions main.py
    Original file line number Diff line number Diff line change
    @@ -98,6 +98,15 @@ async def logging_middleware(request: Request, call_next) -> Response:

    @app.get("/")
    def hello():
    custom_structlog_logger = structlog.stdlib.get_logger("my.structlog.logger")
    custom_structlog_logger.info("This is an info message from Structlog")
    custom_structlog_logger.warning("This is a warning message from Structlog, with attributes", an_extra="attribute")
    custom_structlog_logger.error("This is an error message from Structlog")

    custom_logging_logger = logging.getLogger("my.logging.logger")
    custom_logging_logger.info("This is an info message from standard logger")
    custom_logging_logger.warning("This is a warning message from standard logger, with attributes", extra={"another_extra": "attribute"})

    return "Hello, World!"


  6. nymous created this gist Jun 24, 2022.
    36 changes: 36 additions & 0 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,36 @@
    # Logging setup for FastAPI

    This logging setup configures [Structlog](https://www.structlog.org/) to output pretty logs in development, and JSON log lines in production.

    Requests are assigned a correlation ID with the [`asgi-correlation-id`](https://github.com/snok/asgi-correlation-id) middleware (either captured from incoming request or generated on the fly).
    All logs are linked to the correlation ID, and to the Datadog trace/span if instrumented.
    This data "global to the request" is stored in context vars, and automatically added to all logs produced during the request thanks to Structlog.

    The duration of the request is also captured, and all technical information about the request (URL, method, client IP/port, status code) are logged as attributes.

    In production, Uvicorn startup logs are silenced because they are printed before we have time to configure JSON logging (and log ingestion systems don't like unformatted logs :sweat_smile:).
    Uvicorn standard access logs are disabled, and reimplemented using Structlog (to be formatted in JSON or in the console).

    ## How to run

    Install dependencies with Poetry.

    Run in development:
    ```shell
    uvicorn main:app --reload
    ```
    and in production (to log in JSON format and instrument the code with Datadog):
    ```shell
    LOG_JSON_FORMAT=true DD_TRACE_ENABLED=true ddtrace-run uvicorn main:app --host 0.0.0.0 --log-config uvicorn_disable_logging.json
    ```

    ## Note

    I will add my references and inspirations soon.

    The "duration" logged and returned in the header does not encompass the full duration of the request. The code could be moved to a higher middleware running earlier in the chain, to also capture the duration of "inner" middlewares such as the correlation-id middleware, or others...

    If you install [Rich](https://rich.readthedocs.io/) or [`better-exceptions`](https://github.com/Qix-/better-exceptions), you will automagically get pretty formatted exception with context, variable values... Try it and see if you like it

    There are some TODOs left in the code, you can suggest improvements in the comments :smile:
    The Datadog middleware is especially horrible, there must be a better way to add it manually. I also need to open an issue in Starlette comparing in which order middlewares are activated in different languages and frameworks, to try and make it more consistent with other ecosystems.
    128 changes: 128 additions & 0 deletions custom_logging.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,128 @@
    import logging
    import sys

    import structlog
    from ddtrace import tracer
    from structlog.types import EventDict, Processor


    # https://github.com/hynek/structlog/issues/35#issuecomment-591321744
    def rename_event_key(_, __, event_dict: EventDict) -> EventDict:
    """
    Log entries keep the text message in the `event` field, but Datadog
    uses the `message` field. This processor moves the value from one field to
    the other.
    See https://github.com/hynek/structlog/issues/35#issuecomment-591321744
    """
    event_dict["message"] = event_dict.pop("event")
    return event_dict


    def drop_color_message_key(_, __, event_dict: EventDict) -> EventDict:
    """
    Uvicorn logs the message a second time in the extra `color_message`, but we don't
    need it. This processor drops the key from the event dict if it exists.
    """
    event_dict.pop("color_message", None)
    return event_dict


    def tracer_injection(_, __, event_dict: EventDict) -> EventDict:
    # get correlation ids from current tracer context
    span = tracer.current_span()
    trace_id, span_id = (span.trace_id, span.span_id) if span else (None, None)

    # add ids to structlog event dictionary
    event_dict["dd.trace_id"] = str(trace_id or 0)
    event_dict["dd.span_id"] = str(span_id or 0)

    return event_dict


    def setup_logging(json_logs: bool = False, log_level: str = "INFO"):
    timestamper = structlog.processors.TimeStamper(fmt="iso")

    shared_processors: list[Processor] = [
    structlog.contextvars.merge_contextvars,
    structlog.stdlib.add_logger_name,
    structlog.stdlib.add_log_level,
    structlog.stdlib.PositionalArgumentsFormatter(),
    structlog.stdlib.ExtraAdder(),
    drop_color_message_key,
    tracer_injection,
    timestamper,
    structlog.processors.StackInfoRenderer(),
    ]

    if json_logs:
    # We rename the `event` key to `message` only in JSON logs, as Datadog looks for the
    # `message` key but the pretty ConsoleRenderer looks for `event`
    shared_processors.append(rename_event_key)
    # Format the exception only for JSON logs, as we want to pretty-print them when
    # using the ConsoleRenderer
    shared_processors.append(structlog.processors.format_exc_info)

    structlog.configure(
    processors=shared_processors
    + [
    # Prepare event dict for `ProcessorFormatter`.
    structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
    ],
    logger_factory=structlog.stdlib.LoggerFactory(),
    cache_logger_on_first_use=True,
    )

    log_renderer: structlog.types.Processor
    if json_logs:
    log_renderer = structlog.processors.JSONRenderer()
    else:
    log_renderer = structlog.dev.ConsoleRenderer()

    formatter = structlog.stdlib.ProcessorFormatter(
    # These run ONLY on `logging` entries that do NOT originate within
    # structlog.
    foreign_pre_chain=shared_processors,
    # These run on ALL entries after the pre_chain is done.
    processors=[
    # Remove _record & _from_structlog.
    structlog.stdlib.ProcessorFormatter.remove_processors_meta,
    log_renderer,
    ],
    )

    handler = logging.StreamHandler()
    # Use OUR `ProcessorFormatter` to format all `logging` entries.
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    root_logger.addHandler(handler)
    root_logger.setLevel(log_level.upper())

    for _log in ["uvicorn", "uvicorn.error"]:
    # Clear the log handlers for uvicorn loggers, and enable propagation
    # so the messages are caught by our root logger and formatted correctly
    # by structlog
    logging.getLogger(_log).handlers.clear()
    logging.getLogger(_log).propagate = True

    # Since we re-create the access logs ourselves, to add all information
    # in the structured log (see the `logging_middleware` in main.py), we clear
    # the handlers and prevent the logs to propagate to a logger higher up in the
    # hierarchy (effectively rendering them silent).
    logging.getLogger("uvicorn.access").handlers.clear()
    logging.getLogger("uvicorn.access").propagate = False

    def handle_exception(exc_type, exc_value, exc_traceback):
    """
    Log any uncaught exception instead of letting it be printed by Python
    (but leave KeyboardInterrupt untouched to allow users to Ctrl+C to stop)
    See https://stackoverflow.com/a/16993115/3641865
    """
    if issubclass(exc_type, KeyboardInterrupt):
    sys.__excepthook__(exc_type, exc_value, exc_traceback)
    return

    root_logger.error(
    "Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback)
    )

    sys.excepthook = handle_exception
    105 changes: 105 additions & 0 deletions main.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,105 @@
    import os
    import time

    # This line needs to be run before any `ddtrace` import, to avoid sending traces
    # in local dev environment (we don't have a Datadog agent configured locally, so
    # it prints a stacktrace every time it tries to send a trace)
    # TODO: Find a better way to activate Datadog traces?
    os.environ["DD_TRACE_ENABLED"] = os.getenv("DD_TRACE_ENABLED", "false") # noqa

    import structlog
    import uvicorn
    from asgi_correlation_id import CorrelationIdMiddleware
    from asgi_correlation_id.context import correlation_id
    from ddtrace.contrib.asgi.middleware import TraceMiddleware
    from fastapi import FastAPI, Request, Response
    from pydantic import parse_obj_as
    from uvicorn.protocols.utils import get_path_with_query_string

    from custom_logging import setup_logging


    LOG_JSON_FORMAT = parse_obj_as(bool, os.getenv("LOG_JSON_FORMAT", False))
    LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
    setup_logging(json_logs=LOG_JSON_FORMAT, log_level=LOG_LEVEL)

    access_logger = structlog.stdlib.get_logger("api.access")

    app = FastAPI(title="Example API", version="1.0.0")


    @app.middleware("http")
    async def logging_middleware(request: Request, call_next) -> Response:
    structlog.contextvars.clear_contextvars()
    # These context vars will be added to all log entries emitted during the request
    request_id = correlation_id.get()
    structlog.contextvars.bind_contextvars(request_id=request_id)

    start_time = time.perf_counter_ns()
    # If the call_next raises an error, we still want to return our own 500 response,
    # so we can add headers to it (process time, request ID...)
    response = Response(status_code=500)
    try:
    response = await call_next(request)
    except Exception:
    # TODO: Validate that we don't swallow exceptions (unit test?)
    structlog.stdlib.get_logger("api.error").exception("Uncaught exception")
    raise
    finally:
    process_time = time.perf_counter_ns() - start_time
    status_code = response.status_code
    url = get_path_with_query_string(request.scope)
    client_host = request.client.host
    client_port = request.client.port
    http_method = request.method
    http_version = request.scope["http_version"]
    # Recreate the Uvicorn access log format, but add all parameters as structured information
    access_logger.info(
    f"""{client_host}:{client_port} - "{http_method} {url} HTTP/{http_version}" {status_code}""",
    http={
    "url": str(request.url),
    "status_code": status_code,
    "method": http_method,
    "request_id": request_id,
    "version": http_version,
    },
    network={"client": {"ip": client_host, "port": client_port}},
    duration=process_time,
    )
    response.headers["X-Process-Time"] = str(process_time / 10 ** 9)
    return response


    # This middleware must be placed after the logging, to populate the context with the request ID
    # NOTE: Why last??
    # Answer: middlewares are applied in the reverse order of when they are added (you can verify this
    # by debugging `app.middleware_stack` and recursively drilling down the `app` property).
    app.add_middleware(CorrelationIdMiddleware)

    # UGLY HACK
    # Datadog's `TraceMiddleware` is applied as the very first middleware in the list, by patching `FastAPI` constructor.
    # Unfortunately that means that it is the innermost middleware, so the trace/span are created last in the middleware
    # chain. Because we want to add the trace_id/span_id in the access log, we need to extract it from the middleware list,
    # put it back as the outermost middleware, and rebuild the middleware stack.
    # TODO: Open an issue in dd-trace-py to ask if it can change its initialization, or if there is an easy way to add the
    # middleware manually, so we can add it later in the chain and have it be the outermost middleware.
    # TODO: Open an issue in Starlette to better explain the order of middlewares
    tracing_middleware = next(
    (m for m in app.user_middleware if m.cls == TraceMiddleware), None
    )
    if tracing_middleware is not None:
    app.user_middleware = [m for m in app.user_middleware if m.cls != TraceMiddleware]
    structlog.stdlib.get_logger("api.datadog_patch").info(
    "Patching Datadog tracing middleware to be the outermost middleware..."
    )
    app.user_middleware.insert(0, tracing_middleware)
    app.middleware_stack = app.build_middleware_stack()


    @app.get("/")
    def hello():
    return "Hello, World!"


    if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000, log_config=None)
    22 changes: 22 additions & 0 deletions pyproject.toml
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,22 @@
    [tool.poetry]
    name = "example-api"
    version = "0.1.0"
    description = ""
    authors = ["Nymous"]

    [tool.poetry.dependencies]
    python = "^3.10"
    fastapi = "^0.78.0"
    uvicorn = "^0.18.1"
    ddtrace = "^1.2.1"
    structlog = "^21.5.0"
    asgi-correlation-id = "^3.0.0"

    [tool.poetry.dev-dependencies]
    black = "^22.3.0"
    isort = "^5.10.1"
    mypy = "^0.961"

    [build-system]
    requires = ["poetry-core>=1.0.0"]
    build-backend = "poetry.core.masonry.api"
    40 changes: 40 additions & 0 deletions uvicorn_disable_logging.json
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,40 @@
    {
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
    "default": {
    "()": "uvicorn.logging.DefaultFormatter",
    "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    },
    "access": {
    "()": "uvicorn.logging.AccessFormatter",
    "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    }
    },
    "handlers": {
    "default": {
    "formatter": "default",
    "class": "logging.NullHandler"
    },
    "access": {
    "formatter": "access",
    "class": "logging.NullHandler"
    }
    },
    "loggers": {
    "uvicorn.error": {
    "level": "INFO",
    "handlers": [
    "default"
    ],
    "propagate": false
    },
    "uvicorn.access": {
    "level": "INFO",
    "handlers": [
    "access"
    ],
    "propagate": false
    }
    }
    }