Add datadog certifier #2366

robert-cronin · 2024-12-12T05:13:46Z

Description of the PR

I am not sure if there is a need for a parser or attestation since we're just ingesting CertifyBad for a particular pURL, but if there is a need to represent the source information in a predicate, I'd be happy to try and figure out how to add that in.

PR Checklist

All commits have a Developer Certificate of Origin (DCO) -- they are generated using -s flag to git commit.
All new changes are covered by tests
If GraphQL schema is changed, make generate has been run
If GraphQL schema is changed, GraphQL client updates/additions have been made
If OpenAPI spec is changed, make generate has been run
If ent schema is changed, make generate has been run
If collectsub protobuf has been changed, make proto has been run
All CI checks are passing (tests and formatting)
All dependent PRs have already been merged

funnelfiasco · 2024-12-12T14:47:11Z

As a general comment, I wonder if we want to call it something more specific than "DataDog"? "DataDog Malicious Packages DataSet" is unwieldy, but I'm concerned that there might be some future thing that pulls from DataDog proper and the name is already taken. I don't have any great ideas and this may not be a concern worth worrying about right now, but I wanted to raise it.

pkg/certifier/datadog/datadog.go

robert-cronin · 2024-12-13T01:00:01Z

Yeah, that is a solid point. If DataDog eventually spins out other datasets, I can see how that might cause some confusion. The data itself mostly comes from GuardDog, but I think not exclusively. Maybe we can go with something like datadog-malware-dataset or datadog-mspd, although mspd is not a known acronym. The alternative is datadog-malicious-software-packages-dataset but, like you said, that is a bit unwieldy.
The datadog-malware-dataset one sounds like the best compromise between clarity and brevity to me.

Signed-off-by: robert-cronin <robert.owen.cronin@gmail.com>

pxp928 · 2024-12-19T00:43:09Z

Thanks @robert-cronin! Sorry for the delay. We will review this soon!

robert-cronin · 2024-12-19T01:36:19Z

No problems, thanks @pxp928!

lumjjb

This is a super cool addition. I wasn't aware of this dataset, but this was a really cool implementation and it's such a good example of how to add another data source easily (or at least you made it look easy! Any feedback on how to make this easier would be super great as well, or any particular friction you ran into). Thanks so much for yet another great contribution! 🙌

lumjjb · 2025-01-10T21:44:44Z

pkg/certifier/datadog_malware/datadog_malware.go

+		opt(d)
+	}
+
+	if err := d.fetchManifests(); err != nil {


It looks like manifests are fetched once on initialization. Given that the database will be updated regularly, it would be helpful to refresh the manifests at some frequency. Is this feasible?
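One way the periodic refresh could look is sketched below. This is a minimal illustration, not the PR's code: `manifestStore`, its `fetch` field, `refresh`, `lookup`, and `refreshEvery` are all hypothetical names, and the one-hour interval is an arbitrary placeholder rather than a recommendation. The idea is to keep the initial load, then re-fetch on a `time.Ticker` and swap the new manifests in under a lock only when the fetch succeeds.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// manifestStore is a hypothetical holder for the dataset manifests.
// The real certifier's fields and fetch logic may differ.
type manifestStore struct {
	mu        sync.RWMutex
	manifests map[string][]string // package name -> listed versions
	fetch     func() (map[string][]string, error)
}

// refresh fetches a fresh copy and swaps it in only on success,
// so a failed fetch keeps serving the previous manifests.
func (s *manifestStore) refresh() error {
	m, err := s.fetch()
	if err != nil {
		return err
	}
	s.mu.Lock()
	s.manifests = m
	s.mu.Unlock()
	return nil
}

// lookup returns the versions recorded for a package name.
func (s *manifestStore) lookup(name string) ([]string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.manifests[name]
	return v, ok
}

// refreshEvery re-fetches on each tick until ctx is cancelled.
func (s *manifestStore) refreshEvery(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := s.refresh(); err != nil {
				fmt.Println("manifest refresh failed, keeping previous copy:", err)
			}
		}
	}
}

func main() {
	store := &manifestStore{
		fetch: func() (map[string][]string, error) {
			// Stand-in for the real download of the dataset manifests.
			return map[string][]string{"example-pkg": {"1.0.0"}}, nil
		},
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Initial load on construction, followed by a background refresh.
	if err := store.refresh(); err != nil {
		fmt.Println("initial fetch failed:", err)
		return
	}
	go store.refreshEvery(ctx, time.Hour) // interval is a placeholder

	versions, ok := store.lookup("example-pkg")
	fmt.Println(versions, ok)
}
```

Swapping the whole map under a write lock keeps lookups consistent while a refresh is in flight, and a failed refresh leaves the last good copy in place.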

lumjjb · 2025-01-10T21:47:48Z

pkg/certifier/datadog_malware/datadog_malware.go

+			if pkgInput.Namespace != nil && *pkgInput.Namespace != "" {
+				namespace := strings.TrimPrefix(*pkgInput.Namespace, "@")
+				namespace = strings.TrimPrefix(namespace, "%40")
+				fullName = "@" + namespace + "/" + pkgInput.Name


It looks like packages in the dataset don't always start with "@". Could we add a check here after the trim to see whether the namespace actually had the prefix, and add the "@" only if a prefix was trimmed?
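The suggested check could be sketched roughly as follows. `fullPackageName` is a hypothetical helper, not a function from the PR, and returning the bare name for an empty namespace is an assumption about the surrounding code. It records whether an "@" (or its URL-encoded form "%40") was present before trimming, and re-adds "@" only in that case.

```go
package main

import (
	"fmt"
	"strings"
)

// fullPackageName is a hypothetical helper that builds the name used to
// look a package up in the dataset. It re-adds "@" only when the
// namespace originally carried it (literally or URL-encoded as "%40").
func fullPackageName(namespace, name string) string {
	if namespace == "" {
		// Assumption: unscoped packages are looked up by bare name.
		return name
	}
	hadAt := false
	switch {
	case strings.HasPrefix(namespace, "@"):
		namespace = strings.TrimPrefix(namespace, "@")
		hadAt = true
	case strings.HasPrefix(namespace, "%40"):
		namespace = strings.TrimPrefix(namespace, "%40")
		hadAt = true
	}
	if hadAt {
		return "@" + namespace + "/" + name
	}
	return namespace + "/" + name
}

func main() {
	fmt.Println(fullPackageName("@scope", "pkg"))
	fmt.Println(fullPackageName("%40scope", "pkg"))
	fmt.Println(fullPackageName("scope", "pkg"))
	fmt.Println(fullPackageName("", "pkg"))
}
```

With this shape, a namespace that never had the prefix is passed through unchanged instead of being turned into a scoped name.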

lumjjb · 2025-01-10T21:57:15Z

pkg/certifier/datadog_malware/datadog_malware.go

+}
+
+// NewDatadogMalwareCertifier initializes the Datadog Malicious Software Packages certifier
+func NewDatadogMalwareCertifier(ctx context.Context, assemblerFunc assemblerFuncType, opts ...CertifierOption) (certifier.Certifier, error) {


Can you add a bit of documentation here on what the Datadog malicious software packages are, for those who are not familiar?

In addition could you add some details on:

The added predicates on the graph

The recommended interval times (considering that the current certifier will generate a certifyBad each time indefinitely).

Any caveats: see comment on periodic fetching of manifest.

robert-cronin requested a review from jeffmendoza as a code owner December 12, 2024 05:13

pull-request-size bot added the size/XL label Dec 12, 2024

funnelfiasco reviewed Dec 12, 2024

pkg/certifier/datadog/datadog.go (outdated)

funnelfiasco mentioned this pull request Dec 13, 2024

Clarify meaning of empty version list DataDog/malicious-software-packages-dataset#135

Closed

robert-cronin force-pushed the feat/datadog-certifier branch from cd71306 to 6be472b December 16, 2024 02:11

Add datadog certifier

d2f86e2

Signed-off-by: robert-cronin <robert.owen.cronin@gmail.com>

robert-cronin force-pushed the feat/datadog-certifier branch from 6be472b to d2f86e2 December 16, 2024 03:14

robert-cronin requested a review from funnelfiasco December 16, 2024 03:14

pxp928 added the needs-review Needs writer LGTM label Jan 6, 2025

lumjjb reviewed Jan 10, 2025
