Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cirqflow] Quantum runtime skeleton - part 3 #4584

Merged
merged 4 commits into from
Oct 25, 2021

Conversation

mpharrigan
Copy link
Collaborator

This contains a skeleton of an execution loop that consumes QuantumExecutable and saves ExecutableResult

@google-cla google-cla bot added the cla: yes Makes googlebot stop complaining. label Oct 18, 2021
@CirqBot CirqBot added the size: L 250< lines changed <1000 label Oct 18, 2021
@mpharrigan mpharrigan marked this pull request as ready for review October 18, 2021 21:07
@mpharrigan mpharrigan requested review from cduck, vtomole, wcourtney and a team as code owners October 18, 2021 21:07
def execute(
rt_config: QuantumRuntimeConfiguration,
executable_group: QuantumExecutableGroup,
base_data_dir: str = ".",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could we call this checkpoint_dir ?

Also: Why does this default to your current directory ? I would've thought we default to no checkpointing unless a user explicitly requests to do so by providing a path to store the data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a question of intent, but I imagine the users of this workflow infrastructure will always be saving their data. I would like to encourage that data always makes a trip to persistent storage before analysis and plotting to enforce separation of data generation from data processing. I've considered removing the return value from execute so users have to do it this way; but that's probably too drastic.

As such, the files aren't checkpoints(*) but rather the main point of the execute function.

(*) my view of checkpoint files is that they are a way of restarting a long computation from a known place, which is not the intent of the files written here.

Args:
rt_config: The `cg.QuantumRuntimeConfiguration` specifying how to execute
`executable_group`.
executable_group: The `QuantumExecutableGroup` containing the executables to execute.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing "cg."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +167 to +170
if rt_config.run_id is None:
run_id = str(uuid.uuid4())
else:
run_id = rt_config.run_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one identify runs if they lose the code after runtime ? Should we have a way to allow a user to specify a name they prefer ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you specify a run_id in the rt_config: QuantumRuntimeConfiguration it uses that.

Otherwise, you can look at your filesystem as the directory will be named according to the run_id.

Comment on lines +191 to +211
for i, exe in enumerate(executable_group):
runtime_info = RuntimeInfo(execution_index=i)

if exe.params != tuple():
raise NotImplementedError("Circuit params are not yet supported.")

circuit = exe.circuit

if not hasattr(exe.measurement, 'n_repetitions'):
raise NotImplementedError("Only `BitstringsMeasurement` are supported.")

sampler_run_result = sampler.run(circuit, repetitions=exe.measurement.n_repetitions)

exe_result = ExecutableResult(
spec=exe.spec,
runtime_info=runtime_info,
raw_data=sampler_run_result,
)
cirq.to_json_gzip(exe_result, f'{base_data_dir}/{run_id}/ExecutableResult.{i}.json.gz')
exegroup_result.executable_results.append(exe_result)
print(f'\r{i+1} / {n_executables}', end='', flush=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we expect this loop to grow in complexity the number of features/functionality that these Executables take on, do you think it might be worth structuring things a little more carefully. In this case something like:
allowing a user to provide a list of callbacks that we guarantee to call at the end of this loop iteration. These callbacks could include things like logging, saving, timing, or any other kind of unknown functionality one might want to put after each iteration here ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, yes, and yes.

The QuantumRuntimeConfiguration will learn new fields whose objects' methods serve as callbacks. Logging for sure will be factored out into a callback-like object.

My plan is to get this skeleton merged. The addition of callbacks with defaults and helper functions can be done stepwise and in a backwards-compatible manner.

shared_runtime_info=SharedRuntimeInfo(run_id=run_id),
executable_results=list(),
)
cirq.to_json_gzip(exegroup_result, f'{base_data_dir}/{run_id}/ExecutableGroupResult.json.gz')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GroupResult only gets written once when it is empty ? Do we want another one after it's been filled ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is weird. Let me try to write up the motivation for the current structure and think about how to make it less weird

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecutableGroupResult

Here's the motivation.

Copy link
Collaborator Author

@mpharrigan mpharrigan Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposed selection is to call cirq.to_json on the constituent parts of ExecutableGroupResult rather than ever saving the whole ExecutableGroupResult object. This is because of the different "lifetimes" of the constituent objects.

In this case, I'd like to introduce a bookkeeping dataclass

@dataclasses.dataclass
class ExecutableGroupFilesystemRecord:
    runtime_configuration_fn: str
    shared_runtime_info_fn: str
    executable_result_fns: List[str]

    def load(self):
        return ExecutableGroupResult(
            runtime_configuration=cirq.read_json(self.runtime_configuration_fn),
            shared_runtime_info=cirq.read_json(self.shared_runtime_info_fn),
            executable_results=[cirq.read_json(exe_fn) for exe_fn in self.executable_result_fns],
        )

to keep things under control.

Would you like me to implement that in the current PR or as a follow-on PR? It might be easier to review if this is kept in its own follow-on PR.

@mpharrigan
Copy link
Collaborator Author

@MichaelBroughton PTAL

@mpharrigan mpharrigan added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Oct 25, 2021
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Oct 25, 2021
@MichaelBroughton MichaelBroughton removed the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Oct 25, 2021
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Oct 25, 2021
@CirqBot CirqBot merged commit 954a7b6 into quantumlib:master Oct 25, 2021
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Oct 25, 2021
CirqBot pushed a commit that referenced this pull request Nov 1, 2021
Following #4584 comment.

![ExecutableGroupResult](https://user-images.githubusercontent.com/4967059/138186981-ef4c82fc-6f05-400e-a761-d9a9f0b3257d.png)

- Update the constituent parts only when necessary to avoid re-writing the full `ExecutableGroupResult` when only a part has changed. Re-writing would cause a performance hit and potential corruption risk.
- Add a new dataclass to keep track of filenames to make loading in the data easier; see the test.
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
This contains a skeleton of an execution loop that consumes `QuantumExecutable` and saves `ExecutableResult`
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
Following quantumlib#4584 comment.

![ExecutableGroupResult](https://user-images.githubusercontent.com/4967059/138186981-ef4c82fc-6f05-400e-a761-d9a9f0b3257d.png)

- Update the constituent parts only when necessary to avoid re-writing the full `ExecutableGroupResult` when only a part has changed. Re-writing would cause a performance hit and potential corruption risk.
- Add a new dataclass to keep track of filenames to make loading in the data easier; see the test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/workflow cla: yes Makes googlebot stop complaining. size: L 250< lines changed <1000
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants