Consider alternative transports/formats/markup for TAP #1
Comments
The title of this ticket is very confusing to me; I had no idea what it was about until I read the contents. It may be that you're using jargon from some other testing system that I'm not familiar with.
Both have merits. I think a specified format would be the more useful of the two, but being able to do both may be worthwhile
Maybe. YAML is not easy to parse, and even the YAMLish subset TAP::Harness uses isn't easy. JSON is easier, but less friendly for humans for the sort of data I'd expect.
I'm not sure I understand your question. Do you mean "Should we standardize multiple formats"? |
@Leont Perhaps more helpfully I was using the following terms (probably not helpfully so):
What I was getting at was that we have the potential to redefine how the format is serialised over the wire, and this change could also carry the YAML format within the same serialisation. So, for example, if TAP were serialisable as JSON, the YAML subdocument might just be a key within the test, something like this:
The main opposition I have to all these changes, however, is that I feel they move away from TAP's simplicity: the YAML is currently treated as a dumping ground, and simplistic consumers can be written by newcomer programmers in under an hour. However, there is clearly a need for the kind of extensions provided by TAP-Y and so on. To your final question: yes, I think there could be some merit to having multiple formats for TAP. |
That sounds more like the presentation layer to me.
I'm not sure there's much value in this. What are you trying to achieve?
I would disagree. One of the most valuable traits of TAP is that it's easy to parse. Multiple formats complicate writing a universal parser; what would the benefit be? |
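To make the "easy to parse" point concrete, here is a minimal sketch of a TAP consumer. It assumes only plan lines (`1..N`), `ok` / `not ok` test lines and `TODO` / `SKIP` directives matter, and it silently ignores everything else (comments, indented YAML blocks, `Bail out!`). The function name `parse_tap` and the result layout are made up for this example.

```python
import re

PLAN = re.compile(r"^1\.\.(\d+)")
TEST = re.compile(r"^(not ok|ok)\b\s*(\d+)?\s*-?\s*(.*)$")
DIRECTIVE = re.compile(r"#\s*(TODO|SKIP)\b\s*(.*)$", re.IGNORECASE)

def parse_tap(text):
    """Return (planned, results) for a TAP stream; unknown lines are ignored."""
    planned = None
    results = []
    for line in text.splitlines():
        plan = PLAN.match(line)
        if plan:
            planned = int(plan.group(1))
            continue
        test = TEST.match(line)
        if not test:
            continue  # comments, YAML blocks and other diagnostics
        status, number, rest = test.groups()
        directive = DIRECTIVE.search(rest)
        description = rest[:directive.start()] if directive else rest
        results.append({
            "ok": status == "ok",
            # Unnumbered tests are numbered by their position in the stream.
            "number": int(number) if number else len(results) + 1,
            "description": description.strip(),
            "directive": directive.group(1).upper() if directive else None,
        })
    return planned, results

sample = """1..3
ok 1 - first
not ok 2 - second # TODO later
ok 3 # SKIP no database
"""
planned, results = parse_tap(sample)
print(planned, len(results))
```

This is far from a complete harness, but it shows how little is needed before a consumer becomes useful.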
@Leont I'm aware it isn't the transport layer. My point was that the people I know colloquially call any format that contains another format a transport. It is all the presentation layer, and when talking about layers above the OSI stack the term can be helpful (yet obviously confusing). Anyway, this isn't the important part of my message, and it's not really productive to squabble over slang within the development community :D. The points I was mostly addressing were:
I'm not necessarily for or against progressing the language; however, TAP-Y/J is an alternative that has sparked a discussion on the main Test Anything repo before. I don't completely understand their motivations, but I can certainly see a few advantages to it:
The disadvantages I see mostly are:
The idea of us specifying multiple formats does, as you say, cause issues with writing a universal parser, and that is my inherent issue with doing so. Again, this ticket isn't meant to suggest that it is a good thing to do; it is more to test the waters and ask the question (rather than on a mailing list that very few appear to use). However, if we can specify the same metadata better in the YAML, I think the issue mostly goes away. I was suggesting Thrift and others as they may solve some of the issues raised by TAP-Y/J. |
I call that argument the SOAP argument, and I find it fairly unconvincing. Given how trivial TAP is to produce, I can't take it seriously on that side of the equation. On the consumer side it might make more sense, but I invite anyone who really believes this to write a TAP-J parser in C, or a TAP-Y parser in shell.
I think we have broad consensus that this is desirable. We can achieve that by embedding such a format into TAP, though to be honest I don't believe it to be particularly useful on successful tests. By and large, it's the failing tests that you want the additional information on, and in most test-bases they are rare.
One underappreciated aspect of TAP is that it is human-reader friendly. A JSON document per line very much fails at that.
The biggest test-base I'm working with (perl5) has some 730000 tests in 2400 files. Assembling, outputting and parsing the sort of extra information that TAP-J/Y suggests would likely have a significant impact on performance. |
I think the point of choosing widely used formats for TAP-Y/J is that you don't have to write a parser from scratch. You use an existing JSON/YAML library and do a small amount of work on top of that. Because of this, I would say that writing a TAP-Y/J parser is roughly the same amount of work as writing a TAP(12) parser. However, you also get the benefit of already having your results in a very interoperable format, so, for example, you could just send them straight to CouchDB to store them.
I would guess that IO operations are the bottleneck, not parsing. Using a TAP parser vs a fast JSON parser probably isn't going to make much of a difference (just my speculation, a few benchmarks would let us know one way or the other). TAP-Y/J also doesn't have to contain too much data, e.g.,
for a passing test isn't that bad. It sounds like everyone agrees that we need more information (at least on failing tests) than what is provided by TAP(12). I'm not completely sold on TAP-Y/J as specified in the tapout repo @jonathanKingston linked to above. I would like to simplify it somewhat. But the main advantages I see for it over TAP13 are:
But I can also see the benefits of the simplicity of TAP(12). Maybe the solution is to leave TAP alone and develop TAP-Y/J as a separate standard. |
As @jcelliott hinted, that's unlikely to be your bottleneck, unless you're running on some embedded device or something (here are some numbers for JSON parsing in Python 2.7). However, I do think that scrapping TAP(12/13) for TAP-Y/J isn't the best course of action either. I use TAP-Y/J because I use it for integration testing and I need to know why/where a test failed. In a basic unittest, this is completely unnecessary, and a simpler format is preferred. It's very nice to keep all tests in a common format, so it's nice to use an extensible format.
You don't need to write a TAP-J or TAP-Y parser, just a generator, which can be as easy as a format string. Use an available TAP parser for test reporting and you're golden. Even then, there are JSON libraries in C and bash, and most other languages (see bottom of page) have JSON libraries readily available, if not included in the standard library. For what it's worth, I wrote a TAP-Y/J/(12) parser in Go, and it really didn't take too much time using the libraries readily available.
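A minimal sketch of such a generator, assuming a JSON-document-per-line stream. The field names `type`, `status` and `label` are assumptions for illustration and should be checked against the TAP-Y/J spec before use; `tapj_test` is a made-up helper name.

```python
import json

def tapj_test(status, label):
    """Return one JSON-per-line test document (field names are assumed)."""
    # The document itself is a plain format string; json.dumps is only used
    # to quote and escape the free-text label safely.
    return '{"type": "test", "status": "%s", "label": %s}' % (status, json.dumps(label))

print(tapj_test("pass", "adds two numbers"))
print(tapj_test("fail", 'handles "quotes"'))
```

The `status` value is interpolated as-is, so this only works while the producer controls it; in a shell or C producer the same approach would need its own escaping for the label.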
Sure, if TAP-Y/J is the way to go. I'm more for creating TAP(14) by:
|
The thing I am seeing about TAP-Y is that it creates nested testing by implying that there is a structure of:
It adds a whole heap of opinionated syntax around required semantics which, as much as I like to be strict, is not always readily available to the producer. For example, the number of tests run is not always easily obtainable when spanning multiple systems (writing invalid TAP-Y to pass through to something which aggregates the numbers ultimately defeats the point of having a count). I feel we can cover the nesting within the subtest ticket we have open; metadata around the test type may be useful. However, as @Leont mentioned, I feel the most useful information comes from metadata around test failures. |
One of the more interesting uses of TAP I know of is Tapper by AMD's Operating System Research Center. They're testing entire (virtualized) operating systems with TAP. I suspect they'd have very different informational needs. They're also heavy users of YAML in their TAP, though I don't know exactly how they're using it. @renormalist: mind telling us some more? |
And the worst part is that it's not consistent as to what goes into those sections. For example, the code snippet could be a list of line numbers, or the actual code (relevant bit from TAP-Y spec). I don't think it should be taken verbatim if taken at all. |
@beatgammit I stopped reading after spotting I would be interested in learning more examples from what Tapper is using TAP for and embedding that into the YAML. |
How common is it to transport TAP over the wire and are there any gotchas that need to be accounted for with any transport? I hope that nearly every transport can handle text data, and I don't see TAP ever moving away from text. One thing that may be worth specifying is a mime-type. If TAP is stored in a file or transmitted over HTTP or any other transport that includes the encoding in metadata, is that |
In my mind they are separate standards, but I do see value in keeping them interoperable enough that one can trivially convert one into the other. In particular, synchronizing with most fields of a TAP-Y/J test document where/when appropriate seems sensible. |
Just to chime in with some info about Tapper. We use TAP because it's easy to generate in a world without dependencies, like when we produce it in shell scripts so that it works on a wide range of old-to-new Linux distributions on as many hardware platforms as possible, or in low-level modules where an echo/printf/printk is all you have.
We also transport data in it. There we clearly separate between the transport and the semantics. In TAP itself the border is a bit blurry because the lines are both a syntax and a semantic (description, #TODO/#SKIP directives, etc.). Inside TAP, however, we use YAML to transport benchmark data, and here we defined our own schema to separate it from other data; think of "X-" headers in SMTP. In particular we defined a key "BenchmarkAnythingData" which defines the simple schema inside, consisting of the NAME of a metric and its value, plus any additional free-style key/value pairs to further describe that data point, like UNIT or the execution context. See https://github.com/renormalist/tap-dom/blob/master/t/some_tap6_autotapversion.txt for a small TAP with YAML example.
We prefer YAML over JSON for 2 reasons: YAML traditionally fits better with the line-based philosophy of TAP, and JSON is language dependent; in particular, the boolean barewords true/false don't exist in all languages, which hurts us particularly in our low-level bash/linux/echo/printf world.
To access TAP as a data structure we convert it into a TAP::DOM (see https://metacpan.org/pod/TAP::DOM#STRUCTURE). From there I find the contained benchmarks using Data::DPath with paths like
So to summarize: as far as I understand it, our approach is kind of the reverse of TAP-Y/J. We start with the semantics of TAP plus our own schema definition inside its YAML, and generate the corresponding data structures, instead of starting with structured data in the test.
The main reason is the initially mentioned easy generation in most-simplistic toolchains, which in turn is the main idea behind TAP. I hope this makes sense. :-) |
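The approach described above (plain TAP lines with a schema'd YAML block underneath) can be sketched as follows. This is a hypothetical producer, not Tapper's actual code: the list-of-mappings layout and the `VALUE` key are assumptions, the metric name is invented, and values are written unquoted, so it is only safe for simple scalars.

```python
def tap_with_benchmark(number, description, name, value, **extra):
    """Emit one passing TAP line followed by an indented YAML block."""
    lines = [
        "ok %d - %s" % (number, description),
        "  ---",                       # start of the TAP 13 YAML block
        "  BenchmarkAnythingData:",
        "    - NAME: %s" % name,
        "      VALUE: %s" % value,     # key name assumed for this sketch
    ]
    # Free-style key/value pairs such as UNIT further describe the data point.
    for key in sorted(extra):
        lines.append("      %s: %s" % (key, extra[key]))
    lines.append("  ...")              # end of the YAML block
    return "\n".join(lines)

print("1..1")
print(tap_with_benchmark(1, "disk throughput", "example.disk.read", 123.4, UNIT="MB/s"))
```

Because the output is built only from string formatting, the same lines could be produced with echo/printf in a shell script, which is the point of the dependency-free approach.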
What I really liked from this thread was the idea about a mime-type. This would make sense in lots of places; just think of TAP-Archives, which are essentially ZIP and could get their own mime-type. Great idea. |
In most of the projects where I used TAP I was doing some kind of functional testing with selenium, or some BDD tool and reporting the tests in TAP with YAMLish information about the test (failure or not), I think quite like what @renormalist explained above. We included under the
+1 |
Great idea indeed @beatgammit +1 |
Here is what I think has become clear from this thread, which badly needs closing in favour of much more focused issues:
If there is anything else interesting raised in this issue please feel free to re-raise. It seemed to be getting a little out of hand with how wide the scope was (My fault). |
This has been discussed time and time again; however, I would like to finally settle it.
Currently YAML is mentioned in the TAP 13 specification on the site: http://testanything.org/tap-version-13-specification.html
However, this debug format is unspecified, which raises a few questions:
Relevant issues:
Other transports worth mentioning:
YAML format:
What I wrote for ESLint as a debug is probably a good starting place for a discussion:
https://github.com/eslint/eslint/blob/master/lib/formatters/tap.js
Thanks for all the input so far: @jcelliott and @TestAnything/owners