pandocBiblioCompiler loads multiple bib files by glob #943

Merged
merged 1 commit into from
Jul 11, 2022

Conversation

L-TChen
Contributor

@L-TChen L-TChen commented Jul 7, 2022

This change lets pandocBiblioCompiler load multiple bib files selected by a glob. It should be backwards-compatible.

@Minoru
Collaborator

Minoru commented Jul 7, 2022

Thanks! Can you please update the doc as well? Right now, nothing even hints at the fact that bibFileName can be a glob.

@L-TChen
Contributor Author

L-TChen commented Jul 8, 2022

> Thanks! Can you please update the doc as well? Right now, nothing even hints at the fact that bibFileName can be a glob.

Done!

@Minoru
Collaborator

Minoru commented Jul 8, 2022

Sorry, it just occurred to me that *nix-like OSes allow asterisks and question marks in filenames, so this won't be backwards-compatible. How about adding pandocBibliosCompiler instead? We already have this duality for other functions, so it shouldn't look that bad, I think.

@L-TChen
Contributor Author

L-TChen commented Jul 9, 2022

How does the existing pandocBiblioCompiler work on *nix-like OSes if an asterisk is used in a filename? pandocBiblioCompiler calls readPandocBiblio, which only accepts one bib file, so it appears to me that such filenames may already be handled incorrectly.

@Minoru
Collaborator

Minoru commented Jul 9, 2022

> How does the existing pandocBiblioCompiler work in nix-like OSes if asterisk is used in filenames?

Should work fine, because it doesn't attempt to interpret the filename as a glob. I'm going AFK now, but will try to test this later today.

> readPandocBiblio, which only accept one csl file, is called by pandocBiblioCompiler

Perhaps it's just a typo, but in case it's not: the problem I'm describing is not about CSL files, it's about biblio files. Imagine an existing user who has a call like this:

pandocBiblioCompiler "style.csl" "refs*.bib"

Assume also that they have files "style.csl", "refs*.bib", and "refs1.bib" on the filesystem, and match calls for them in Hakyll.

With the current version of pandocBiblioCompiler, Pandoc will get the style "style.csl" and one reference file "refs*.bib". With the proposed change, it will get the same style but two reference files, "refs*.bib" and "refs1.bib". Again, I need to test this, but I think if e.g. the same reference appears in both files, then the result might change.

This is all very contrived, and in need of testing, but even if there is no problem I'd still prefer a separate function because we already have duplicates for read... and process... functions.
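
The difference between treating "refs*.bib" as a literal filename and as a glob can be sketched in plain Haskell. This is only an illustration: globMatch is a hypothetical toy matcher written for this example (it is not Hakyll's Pattern implementation and, unlike a real path glob, does not treat '/' specially), and the file list mirrors the scenario described above.

```haskell
import Data.List (tails)

-- Toy glob matcher (hypothetical helper, NOT Hakyll's implementation):
-- '*' matches any run of characters, including the empty one;
-- every other character must match itself exactly.
globMatch :: String -> String -> Bool
globMatch []       []     = True
globMatch []       _      = False
globMatch ('*':ps) s      = any (globMatch ps) (tails s)
globMatch (p:ps)   (c:cs) = p == c && globMatch ps cs
globMatch _        []     = False

-- Files assumed to exist on the filesystem, as in the scenario above.
files :: [FilePath]
files = ["style.csl", "refs*.bib", "refs1.bib"]

-- Literal interpretation: only the file whose name is exactly "refs*.bib".
literalHits :: [FilePath]
literalHits = filter (== "refs*.bib") files

-- Glob interpretation: every file the pattern matches.
globHits :: [FilePath]
globHits = filter (globMatch "refs*.bib") files

main :: IO ()
main = do
  print literalHits  -- ["refs*.bib"]
  print globHits     -- ["refs*.bib","refs1.bib"]
```

Under the literal reading one file is selected; under the glob reading the same argument selects two, which is the behaviour change being discussed.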

@Minoru
Collaborator

Minoru commented Jul 10, 2022

I managed to demonstrate the problem I describe above. Here's a patch to current master (8ce2973):

diff --git a/tests/Hakyll/Web/Pandoc/Biblio/Tests.hs b/tests/Hakyll/Web/Pandoc/Biblio/Tests.hs
index ab78d5b..30f988a 100644
--- a/tests/Hakyll/Web/Pandoc/Biblio/Tests.hs
+++ b/tests/Hakyll/Web/Pandoc/Biblio/Tests.hs
@@ -27,6 +27,7 @@ tests = testGroup "Hakyll.Web.Pandoc.Biblio.Tests" $
     [ goldenTest01
     , goldenTest02
     , goldenTest03
+    , goldenTest04
     ]
 
 --------------------------------------------------------------------------------
@@ -134,3 +135,30 @@ goldenTest03 =
             cleanTestEnv
 
             return output)
+
+goldenTest04 :: TestTree
+goldenTest04 =
+    goldenVsString
+        "biblio01"
+        (goldenTestsDataDir </> "cites-meijer.golden")
+        (do
+            -- Code lifted from https://github.com/jaspervdj/hakyll-citeproc-example.
+            logger <- Logger.new Logger.Error
+            let config = testConfiguration { providerDirectory = goldenTestsDataDir }
+            _ <- run RunModeNormal config logger $ do
+                match "default.html" $ compile templateCompiler
+                match "chicago.csl" $ compile cslCompiler
+                -- Note: this compiles *both* refs1.bib and refs*.bib
+                match "refs*.bib"    $ compile biblioCompiler
+                match "page.markdown" $ do
+                    route $ setExtension "html"
+                    compile $
+                        pandocBiblioCompiler "chicago.csl" "refs*.bib" >>=
+                        loadAndApplyTemplate "default.html" defaultContext
+
+            output <- fmap LBS.fromStrict $ B.readFile $
+                    destinationDirectory testConfiguration </> "page.html"
+
+            cleanTestEnv
+
+            return output)
diff --git a/tests/data/biblio/refs*.bib b/tests/data/biblio/refs*.bib
new file mode 100644
index 0000000..e4cd89f
--- /dev/null
+++ b/tests/data/biblio/refs*.bib
@@ -0,0 +1,8 @@
+@inproceedings{meijer1991functional,
+  title={Functional programming with bananas, lenses, envelopes and barbed wire},
+  author={Meijer, Erik and Fokkinga, Maarten and Paterson, Ross},
+  booktitle={Conference on Functional Programming Languages and Computer Architecture},
+  pages={124--144},
+  year={1991},
+  organization={Springer}
+}
diff --git a/tests/data/biblio/refs1.bib b/tests/data/biblio/refs1.bib
new file mode 100644
index 0000000..d7085b5
--- /dev/null
+++ b/tests/data/biblio/refs1.bib
@@ -0,0 +1,8 @@
+@inproceedings{meijer1991functional,
+  title={One},
+  author={Meijer, Erik and Fokkinga, Maarten and Paterson, Ross},
+  booktitle={Conference on Functional Programming Languages and Computer Architecture},
+  pages={124--144},
+  year={1991},
+  organization={Springer}
+}

This new test passes on Linux. When I add the patch from this PR, the test fails:

    biblio01:              FAIL
      Test output was different from 'tests/data/biblio/cites-meijer.golden'. It was:
      <!doctype html>
      <html lang="en">
          <head>
              <meta charset="utf-8">
              <title>This page cites a paper.</title>
          </head>
          <body>
              <h1>This page cites a paper.</h1>
              <p>I would like to cite one of my favourite papers <span class="citation" data-cites="meijer1991functional">(Meijer, Fokkinga, and Paterson 1991)</span> here.</p>
      <div id="refs" class="references csl-bib-body hanging-indent" role="doc-bibliography">
      <div id="ref-meijer1991functional" class="csl-entry" role="doc-biblioentry">
      Meijer, Erik, Maarten Fokkinga, and Ross Paterson. 1991. <span>“One.”</span> In <em>Conference on Functional Programming Languages and Computer Architecture</em>, 124–44. Springer.
      </div>
      </div>
          </body>
      </html>

      Use -p '$0=="Hakyll.Hakyll.Web.Pandoc.Biblio.Tests.biblio01"' to rerun this test only.

This happens because it picks a reference from "refs1.bib" rather than the intended "refs*.bib".
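
Why a second file can change the output can be sketched with association lists. This is a simplified model, not how Pandoc or citeproc actually resolve duplicate keys: the Key and Title types, the two helper functions, and both merge policies are assumptions made for illustration. The point is only that once two files define the same key, the rendered title depends on which entry the merge prefers.

```haskell
type Key   = String
type Title = String

-- Entries modelled on the two test files in the patch above.
refsStar, refs1 :: [(Key, Title)]
refsStar = [("meijer1991functional",
             "Functional programming with bananas, lenses, envelopes and barbed wire")]
refs1    = [("meijer1991functional", "One")]

-- Two hypothetical merge policies for duplicate keys.
firstWins, lastWins :: [(Key, Title)] -> Key -> Maybe Title
firstWins refs k = lookup k refs            -- earliest entry is kept
lastWins  refs k = lookup k (reverse refs)  -- latest entry is kept

key :: Key
key = "meijer1991functional"

-- Only "refs*.bib" is loaded: the result is unambiguous.
singleFile :: Maybe Title
singleFile = lastWins refsStar key

-- Both files are loaded: the result depends on the assumed policy.
mergedLast, mergedFirst :: Maybe Title
mergedLast  = lastWins  (refsStar ++ refs1) key
mergedFirst = firstWins (refsStar ++ refs1) key

main :: IO ()
main = do
  print singleFile
  print mergedLast
  print mergedFirst
```

With a single file, both policies agree; with two files, mergedLast yields "One" while mergedFirst keeps the original title, so a golden test written against the single-file behaviour can break.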

@L-TChen
Contributor Author

L-TChen commented Jul 11, 2022

Right, thanks for your feedback. I have defined a separate pandocBibliosCompiler, as you suggested.

@Minoru
Collaborator

Minoru commented Jul 11, 2022

Great, thank you very much!

@Minoru Minoru merged commit 1645d9c into jaspervdj:master Jul 11, 2022
@L-TChen L-TChen deleted the multi-bib-by-default branch July 20, 2022 08:23