vincentiusmartin/trace-edit

Trace Editor

To run, invoke trace-editor.py from the repository root directory.
Please supply correctly formatted input for now; the script does not yet perform any advanced validation.

Before running, create two folders (or symlinks) inside this directory:
./in: contains all input files
./out: contains all output files

The script reads every input file from ./in and writes every output file to ./out.
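
The directory setup described above can be scripted. The following is a minimal sketch, not part of the repository: the helper name `ensure_io_dirs` is hypothetical, and it assumes plain folders are wanted rather than symlinks.

```python
from pathlib import Path


def ensure_io_dirs(root="."):
    """Create the in/ and out/ folders under root if they are missing.

    Hypothetical helper for illustration; trace-editor.py does not ship it.
    """
    paths = []
    for name in ("in", "out"):
        path = Path(root) / name
        # exist_ok=True leaves an existing folder, or a symlink that
        # points at a folder, untouched instead of raising an error.
        path.mkdir(exist_ok=True)
        paths.append(path)
    return paths
```

Calling `ensure_io_dirs()` from the repository root is safe to repeat, since existing folders are left as they are.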

Keep in mind that every trace must be preprocessed before any of the script's other functions can be used on it.

List of commands:

1. Preprocess a trace, or all traces inside a directory.
Supported trace types:

    • Microsoft Server Trace
    • BlkReplay's blktrace
    • Unix's blktrace (in our case, so far this is the same as the Hadoop trace)

    python trace-editor.py -file <tracename> -preprocessMSTrace (-filter read/write)
    python trace-editor.py -file <tracename> -preprocessBlkReplayTrace (-filter read/write)
    python trace-editor.py -file <tracename> -preprocessUnixBlkTrace (-filter read/write)

It can also preprocess all traces inside a directory. Here is an example using MS-Trace:

    python trace-editor.py -dir <dirname> -preprocessMSTrace (-filter read/write)

2. Modify a trace (precondition: the trace must have been preprocessed).
Scale every request size by 2x and every request arrival time by 0.5x:

    python trace-editor.py -file <tracename> -resize 2 -rerate 0.5
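
To make the effect of the two factors concrete, here is a rough sketch of the arithmetic. It is not the script's actual code: the `Request` record, its field names, and its units are assumptions made for illustration, since the preprocessed trace format is not described here.

```python
from typing import NamedTuple


class Request(NamedTuple):
    arrival_ms: float  # arrival time (hypothetical field name and unit)
    size: int          # request size (hypothetical field name and unit)


def resize_and_rerate(requests, resize=1.0, rerate=1.0):
    """Multiply every size by `resize` and every arrival time by `rerate`."""
    return [
        Request(r.arrival_ms * rerate, int(r.size * resize))
        for r in requests
    ]


trace = [Request(0.0, 8), Request(100.0, 16), Request(250.0, 8)]
# Same factors as the command above: sizes double, arrival times halve.
modified = resize_and_rerate(trace, resize=2, rerate=0.5)
```

With a rerate factor of 0.5 the arrival times are halved, so the same requests arrive in half the time and the request rate doubles.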

3. Combine traces (precondition: the traces must have been preprocessed).
Make sure the trace names are well ordered, because the script processes them as they are without sorting them. Well ordered means the traces are ordered from the earliest time to the latest time. Check this condition with -ls.

    python trace-editor.py -dir <dirname> -combine

4. Break a trace into RAID-0 disks. This example produces 4 RAID disks with a stripe unit size of 65536 bytes:

    python trace-editor.py -breaktoraid -file <infile> -ndisk 4 -stripe 65536
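
For readers unfamiliar with RAID-0 striping, the sketch below shows the standard mapping from a logical byte range to per-disk pieces. It illustrates the general technique only and may differ from what the script does internally; the function name `split_raid0` is hypothetical, and byte offsets are assumed (the trace's real offset unit may be different).

```python
def split_raid0(offset, length, ndisk=4, stripe=65536):
    """Split a logical byte range into (disk, disk_offset, length) pieces.

    Standard RAID-0 layout: stripe unit k lives on disk k % ndisk,
    at row k // ndisk on that disk.
    """
    pieces = []
    end = offset + length
    while offset < end:
        stripe_index = offset // stripe   # which stripe unit we are in
        within = offset % stripe          # position inside that unit
        disk = stripe_index % ndisk
        disk_offset = (stripe_index // ndisk) * stripe + within
        # A piece never crosses a stripe unit boundary.
        chunk = min(stripe - within, end - offset)
        pieces.append((disk, disk_offset, chunk))
        offset += chunk
    return pieces
```

A request that crosses a stripe unit boundary is split into several pieces, one per stripe unit touched, which is why a single large request can land on more than one disk.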

5. Check I/O imbalance across the RAID disks. This example uses 3 disks with a granularity of 5 minutes:

    python trace-editor.py -ioimbalance -file <filename> -granularity 5

6. Check the busiest or the most loaded (in kB) time for a specific disk in a directory.
Busiest = the time range with the largest number of requests
Most loaded = the time range with the largest total request size

Notes:
duration - in minutes; in this example 60 minutes (1 hour)
top - number of top results to report; in this example the top 3

    python trace-editor.py -dir <dirname> -mostLoaded -duration 60 -top 3
    python trace-editor.py -dir <dirname> -busiest -duration 60 -top 3

Checking the largest average time works the same way as busiest and most loaded:

    python trace-editor.py -dir <dirname> -busiest -duration 60 -top 3
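
The difference between busiest and most loaded can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: requests are assumed to be `(arrival_minutes, size_kb)` pairs, time is assumed to be divided into fixed, non-overlapping windows, and the name `top_windows` is hypothetical.

```python
from collections import defaultdict


def top_windows(requests, duration_min=60, top=3):
    """Rank fixed time windows by request count and by total size.

    requests: iterable of (arrival_minutes, size_kb) pairs (assumed layout).
    Returns (busiest, most_loaded), each a list of (window_index, value).
    """
    counts = defaultdict(int)    # window index -> number of requests
    loads = defaultdict(float)   # window index -> total size in kB
    for arrival, size in requests:
        window = int(arrival // duration_min)
        counts[window] += 1
        loads[window] += size

    def rank(table):
        return sorted(table.items(), key=lambda kv: kv[1], reverse=True)[:top]

    return rank(counts), rank(loads)
```

A window holding many small requests ranks high on busiest, while a window holding a few very large requests ranks high on most loaded, so the two lists can disagree.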

7. Top large I/O. In this example:
Top 3 large I/Os with a size greater than or equal to 64 kB, over a 1-hour duration

    python trace-editor.py -toplargeio -file <filename> -offset 64 -devno 0 -duration 60 -top 3

8. Find the time range with the most random writes. In this example:
Find the time ranges (in minutes) that contain the most random writes

    python trace-editor.py -dir <dirname> -mostRandomWrite -duration 5 -devno 5 -top 3

9. Get characteristic info from a preprocessed trace (usually after cutting the original preprocessed trace, for devno reasons). In this example:
You get whisker-plot-style information about write size, read size, and time density, plus % write, % read, and % random write

    python trace-editor.py -dir <dirname> -characteristic

10. Cut a trace. This example keeps the time range between minute 5 and minute 10:

    python trace-editor.py -cuttrace -file <filename> -timerange 5 10
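
Conceptually, cutting a trace is a filter on arrival time. The sketch below shows one way to do it and rests on several assumptions not confirmed by this README: requests are tuples whose first element is the arrival time in milliseconds, the range is half-open (start included, end excluded), and timestamps are left as they are rather than rebased to zero. The name `cut_trace` is hypothetical.

```python
MS_PER_MINUTE = 60_000


def cut_trace(requests, start_min, end_min):
    """Keep requests whose arrival time falls in [start_min, end_min).

    requests: iterable of tuples with arrival time in ms first (assumed).
    """
    start_ms = start_min * MS_PER_MINUTE
    end_ms = end_min * MS_PER_MINUTE
    return [r for r in requests if start_ms <= r[0] < end_ms]
```

For the range 5 to 10, a request arriving at exactly minute 5 is kept and one arriving at exactly minute 10 is dropped under the half-open assumption.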
