File Widget: Can't switch column type from categorical to datetime #2974

Closed
@tbuttle

Description

When loading my CSV or Excel file with a Date field, Orange does not recognize the field as a date or let me change its type to DateTime. All date fields come in as Categorical data. I'm trying to do a time series prediction, but without the dates being recognized I'm unable to build my prediction model. I've tried dates in several different formats to no avail.

Please note that the format in my data set matches Orange's documentation. YYYY-MM-DD is the required format and is one of the formats I've tried to get Orange to recognize. Please see documentation & screenshots below.

[Screenshot: date time specs - orange]
[Screenshot: time series data - orange]

Activity

kernc

kernc commented on Mar 26, 2018

@kernc
Contributor

The following minimal example file works for me: date-test.csv.zip (.xls or .xlsx work the same).

Can you attach a small example of your data?

tbuttle

tbuttle commented on Mar 26, 2018

@tbuttle
Author

File Attached -

Orange isn't recognizing any of the date formats attached. I'm able to change the "Numeric" values to date/time but unable to do so when the dates register as "Categorical" values.

time series sample - orange

Date Sample.xlsx

kernc

kernc commented on Mar 26, 2018

@kernc
Contributor

Right. The columns are heuristically marked as categorical due to only some 60 unique values for some 2.5k rows. The real issue here imho is that the widget doesn't allow switching from categorical to datetime.
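To check whether a date column has few distinct values relative to its row count (the situation described above), a rough diagnostic can be sketched in plain Python. This does not reproduce Orange's actual heuristic, whose threshold is not stated in this thread; the function name `uniqueness_report` and the sample values are made up for illustration.

```python
from collections import Counter

def uniqueness_report(values):
    """Summarize how many distinct values a column holds (hypothetical helper)."""
    counts = Counter(values)
    total = len(values)
    unique = len(counts)
    return {
        "rows": total,
        "unique": unique,
        # Share of distinct values; a low ratio means many repeated dates.
        "ratio": unique / total if total else 0.0,
        "has_duplicates": unique < total,
    }

# Made-up sample: four rows, one date repeated.
dates = ["2018-03-01", "2018-03-01", "2018-03-02", "2018-03-03"]
report = uniqueness_report(dates)
print(report)
```

A column like the one in the attached file (about 60 unique values in about 2.5k rows) would show a very low ratio here.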

You can force Orange to interpret a column as datetime by prefixing the name with "T#", e.g.

T#Date

See also: http://orange-visual-programming.readthedocs.io/loading-your-data/index.html#header-with-attribute-type-information
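A minimal sketch of that header convention, assuming the data is written as CSV with Python's standard `csv` module. The helper `add_type_prefix`, the column names, and the rows are all hypothetical; only the `T#` prefix itself comes from the comment above.

```python
import csv
import io

def add_type_prefix(header, column, prefix="T#"):
    """Return a copy of `header` with `prefix` put in front of `column`."""
    return [prefix + name if name == column else name for name in header]

# Made-up sample data; dates are ISO (YYYY-MM-DD) and repeat on purpose.
header = ["Date", "Sales"]
rows = [["2018-03-01", "10"], ["2018-03-01", "12"], ["2018-03-02", "9"]]

# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(add_type_prefix(header, "Date"))
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The first line of the output is `T#Date,Sales`, which is the kind of header the File widget should read as a datetime column.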

changed the title from "Orange is not recognizing a Date/Time Field when loading a CSV or Excel File despite format matching specified documentation" to "File Widget: Can't switch column type from categorical to datetime" on Mar 26, 2018

ajdapretnar

ajdapretnar commented on Mar 28, 2018

@ajdapretnar
Contributor

> The real issue here imho is that the widget doesn't allow switching from categorical to datetime.

+1 on this. An issue was opened a while ago with similar concerns: #1520.

nemontemi

nemontemi commented on Feb 21, 2019

@nemontemi

It seems that the column containing the datetime data defaults to "categorical" (without possibility to change) if the column is non-unique, i.e., if there are duplicate datetimes within that column.

self-assigned this on Mar 8, 2019

drgooo

drgooo commented on Mar 11, 2019

@drgooo

I had the same issue. The values didn't follow the datetime format Orange requires. Adding a small Python script solved the issue. It might not be the cleanest, but it did the job for me.

It takes the input file, creates a new column and sets the value to the correct string format.

import datetime
from Orange.data import Domain, Table, TimeVariable

# in_data is the input table (as provided by Orange's Python Script widget)
# Make a new domain based on existing columns, adding the extra 'newDate' one
new_domain = Domain(["Fiscal Date", "Column2", TimeVariable.make("newDate")], in_data.domain.class_vars, source=in_data.domain)
# Construct a new table based on the new domain, using the input data
new_data = Table(new_domain, in_data)

# Format the date to align with Orange's requirements
for inst in new_data:
    inst[2] = str(datetime.datetime.strptime(str(inst[0]), "%Y-%m-%d"))

# Pass the converted table on to the widget's output
out_data = new_data

Looking at your table, you'll need lines of code to copy your columns 1, 3 and 7 into new columns. I included your first column in the domain definition as an example, but you'd need to add the rest if you wanted them in the final output.

bhavin83012

bhavin83012 commented on Mar 14, 2019

@bhavin83012

Hello,

Software: Orange V 3.20

I loaded an Excel file that includes a date column, with the date format set to ISO. Orange's File widget did not recognize the dates and loaded the column as Categorical. In the data table, strange numbers appear in the Date column. I tried different files and the same thing happens. (Please see the screenshot below.)
[Screenshot: 2019-03-14 13_59_10-_]

Then I tried the T# prefix to force the column to be read as a date. Orange did recognize it as a date, but this time the date range was completely different: the dates started at 1970-01-01 hh:mm:ss. See the screenshot below.
[Screenshot: image]

It would be very helpful if someone could help me get this issue fixed. I am not a programmer, so I cannot see the solution myself.
Below is a screenshot of the file.
[Screenshot: image]

The file is attached herewith.
date-test - Copy.xlsx

ajdapretnar

ajdapretnar commented on Mar 15, 2019

@ajdapretnar
Contributor

@bhavin83012 This is not an issue with Orange, but an issue with Excel. Excel tries to be smart, and once it recognizes that a column is a datetime variable, it reformats it behind the scenes. You need to set the number format to Text to force Excel not to mess with your data.
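One plausible reading of the symptoms reported above, stated here as an assumption rather than something confirmed in this thread: Excel stores a date cell as a serial day count, and if such a number is then read as seconds since the Unix epoch, every date lands on 1970-01-01. The sketch below illustrates both interpretations using only the standard library. The sample serial `43538` and the helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Day zero of Excel's default (1900) date system, as commonly used for
# conversion; valid for serials from 1900-03-01 onward.
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_iso(serial):
    """Interpret a number as an Excel serial day count and return an ISO date."""
    return (EXCEL_EPOCH + timedelta(days=serial)).date().isoformat()

def as_unix_seconds(value):
    """Interpret the same number as seconds since 1970-01-01 (UTC)."""
    moment = datetime.fromtimestamp(value, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

serial = 43538  # hypothetical raw value seen in the Date column
print(excel_serial_to_iso(serial))  # 2019-03-14
print(as_unix_seconds(serial))      # 1970-01-01 12:05:38
```

If the raw numbers in the table look like the serial above, storing the cells as Text in ISO format, as suggested, avoids the conversion altogether.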

ajdapretnar

ajdapretnar commented on Mar 19, 2019

@ajdapretnar
Contributor

This is not solved. File widget's domain editor should enable changing categorical to datetime if possible.

bhavin83012

bhavin83012 commented on Mar 19, 2019

@bhavin83012

Hi Ajda,
Your advice was useful. I worked it out in the following way.
Open the original Excel file with an open-source office suite (say, LibreOffice), then change the date format to ISO. You can then use the file as it is or convert it to CSV, and that should work in Orange.

Thank you for your support.
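The same reformatting to ISO can also be scripted. The sketch below is a minimal example using only Python's standard library; the helper name `to_iso_date` and the input format `%m/%d/%Y` are assumptions, so adjust the format string to whatever the source file actually contains.

```python
from datetime import datetime

def to_iso_date(text, input_format="%m/%d/%Y"):
    """Parse `text` with `input_format` and return it as YYYY-MM-DD."""
    return datetime.strptime(text.strip(), input_format).date().isoformat()

# Made-up examples in two common non-ISO layouts.
print(to_iso_date("03/14/2019"))              # 2019-03-14
print(to_iso_date("14.03.2019", "%d.%m.%Y"))  # 2019-03-14
```

`datetime.strptime` raises `ValueError` on a value that does not match the format, which makes stray or malformed cells easy to spot before loading the file into Orange.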

      File Widget: Can't switch column type from categorical to datetime · Issue #2974 · biolab/orange3