Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed double quotes copy and paste bug #2202

Merged
merged 11 commits into from
Sep 7, 2022

Conversation

amy-morrill
Copy link
Contributor

@amy-morrill amy-morrill commented Aug 22, 2022

Addresses #2185
I created a new parse function based on the previously used SheetClip.parse function, removing the parts which replace consecutive double quotes with one double quote. Moreover I added a test in components/dash-table/tests/selenium/test_basic_copy_paste.py which verifies that consecutive double quotes are un-modified when copied and pasted into a dash-table.

Contributor Checklist

  • I have broken down my PR scope into the following TODO tasks
    • Replace SheetClip.parse with inline function fixing bug
    • Add a test to verify that bug is fixed
  • I have run the tests locally and they passed. (refer to testing section in contributing)
  • I have added tests, or extended existing tests, to cover any new features or bugs fixed in this PR

optionals

  • I have added entry in the CHANGELOG.md
  • If this PR needs a follow-up in dash docs, community thread, I have mentioned the relevant URLS as follows
    • this GitHub #PR number updates the dash docs
    • here is the show and tell thread in Plotly Dash community

if (temprows.length > 1 && temprows[temprows.length - 1] === '') {
temprows.pop();
}
const rows = temprows.map(row => row.split('\t'));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tracking down the original SheetClip.parse. I'm a little worried that this simple replacement - while fixing the problem you encountered with multiple quotes within a cell - will cause other problems due to extra cases that were covered by SheetClip and missed here. I'm thinking particularly about cells that have tab or newline characters in them, which presumably are the whole reason for all that monkey business with quotes, so if a cell starts and ends with quote characters it can have tabs and newlines inside it and these don't cause the parser to move to the next column or row.

Then of course we could ask what happens if there are ALSO quote characters inside the cell text. I'd have guessed these would be escaped (with a backslash before the quote) but I see no code in the original SheetClip.parse to handle that so perhaps the quote character is doubled? That would explain their "replace two quotes with one" step, it's just applied too broadly.

Anyway I think the goal of SheetClip as expressed in its head comment is our goal as well:

This tiny library transforms JavaScript arrays to strings that are pasteable by LibreOffice, OpenOffice, Google Docs and Microsoft Excel.

Meaning we want to be able to copy and paste rectangular sections of dash table to and from common spreadsheet programs, as well as to other dash tables, regardless of what text is in those cells. So I think this requires a little more investigation about how such cells are encoded by spreadsheets - and it may mean we have more work to do on the copy side as well as on paste.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense -- I was wondering if this solution was too simplistic.

One thought I have (which may also be too simplistic) is that when parsing pasted cells we could first "escape" quotes by adding a backslash before each quote and then keep the SheetClip.parse function the same except for changing .replace(/""/g, '"') steps to something like .replace('\"', '"'). Does that general idea sound okay to you or am I oversimplifying?

Regardless, I'm happy to keep looking at this. Thank you again for your help!

@amy-morrill amy-morrill marked this pull request as draft August 23, 2022 15:38
@amy-morrill amy-morrill marked this pull request as ready for review September 1, 2022 15:51
@amy-morrill
Copy link
Contributor Author

In response to the comments left on my first attempt at this PR, I've edited my parse function to preserve more of the logic from the SheetClip parse function. In particular, my parse function now allows cells with multiple lines to be pasted into a dash table from an outside source.

I discovered that when a cell with multiple lines is copied from google sheets, excel, etc, its contents are wrapped in double quotes, and every double quote within the cell is replaced by two consecutive double quotes, explaining SheetClip's logic to replace two consecutive double quotes with one single double quote. However, when a cell without multiple lines is copied from google sheets, excel, etc, double quotes within the cell are not replaced by two consecutive double quotes, which leads to the bug I found -- consecutive double quotes erroneously being replaced by a single double quote. My parse function now only replaces consecutive double quotes with one double quote in cells with multiple lines. Thus, my PR preserves the intention of SheetClip's parse function, while fixing the bug I found.

To give you an example of the copy/paste behavior I discuss above, I copied the following cells from a google sheets table:
Screen Shot 2022-09-01 at 11 05 12 AM
I then added a log statement to the fromClipboard function in TableClipboardHelper:
Screen Shot 2022-09-01 at 11 26 03 AM
Then I pasted those cells into a dash-table, getting the following log message:
Screen Shot 2022-09-01 at 11 29 01 AM

Copy link
Collaborator

@alexcjohnson alexcjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💃 Nicely done! The quote-doubling behavior is funny, must be some extremely old artifact from Excel... nice job tracking that down!

Ready to go after my last two comments.

@amy-morrill
Copy link
Contributor Author

Ready to go after my last two comments.

Great!! Do you have any idea when this change will be released to dash users? It affects a project I'm working on, so I want to report the timeline back to my boss

@alexcjohnson alexcjohnson merged commit dd13278 into plotly:dev Sep 7, 2022
@alexcjohnson
Copy link
Collaborator

@amy-morrill a number of bug fixes have landed so it's about time for us to make a patch release. There are a couple of other loose ends we're trying to tie up but likely end of this week or early next we'll make a new release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants