Downloading and parsing of 27M XML datasets

Cancelled. Posted May 12, 2016. Paid on delivery.

1. Please download 27M datasets from [login to view URL] to [login to view URL]

2. Parse the following fields and store them into MySQL

<PMID>

<ArticleDate>

<ArticleTitle>

<Language>

<Journal>

<Keyword>

<Author>

<AffiliationInfo>
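
A minimal sketch of how the fields listed above might be extracted in Python with the standard library's xml.etree.ElementTree. The element nesting in SAMPLE (PubmedArticle, MedlineCitation, AuthorList, LastName/ForeName, KeywordList, Year/Month/Day) is an assumption, since the posting only lists tag names and the source URLs are hidden. The sample record, the function names and the dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the nesting is an assumption, not taken from the posting.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <Article>
        <Journal><Title>Example Journal</Title></Journal>
        <ArticleTitle>An example title</ArticleTitle>
        <AuthorList>
          <Author>
            <LastName>Doe</LastName>
            <ForeName>Jane</ForeName>
            <AffiliationInfo>
              <Affiliation>Institute of Pathology, University of Heidelberg, Heidelberg, Germany.</Affiliation>
            </AffiliationInfo>
          </Author>
        </AuthorList>
        <Language>eng</Language>
        <ArticleDate><Year>2016</Year><Month>05</Month><Day>12</Day></ArticleDate>
      </Article>
      <KeywordList>
        <Keyword>pathology</Keyword>
        <Keyword>biopsy</Keyword>
      </KeywordList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def text_of(node, path):
    """Return the stripped text under `path` (inline markup flattened), or None."""
    found = node.find(path)
    if found is None:
        return None
    text = "".join(found.itertext()).strip()
    return text or None


def parse_article(article):
    """Turn one <PubmedArticle> element into a plain dict."""
    date = article.find(".//ArticleDate")
    article_date = None
    if date is not None:
        parts = [text_of(date, tag) for tag in ("Year", "Month", "Day")]
        if all(parts):
            article_date = "-".join(parts)

    authors = [
        {
            "last_name": text_of(author, "LastName"),
            "first_name": text_of(author, "ForeName"),
            "affiliations": [
                "".join(a.itertext()).strip()
                for a in author.iterfind("AffiliationInfo/Affiliation")
            ],
        }
        for author in article.iterfind(".//Author")
    ]
    keywords = ["".join(k.itertext()).strip() for k in article.iterfind(".//Keyword")]

    return {
        "pmid": text_of(article, ".//PMID"),
        "article_date": article_date,
        "title": text_of(article, ".//ArticleTitle"),
        "language": text_of(article, ".//Language"),
        "journal": text_of(article, ".//Journal/Title"),
        "keywords": [k for k in keywords if k],
        "authors": authors,
    }


def parse_articles(xml_text):
    """Parse every <PubmedArticle> found in an XML string."""
    root = ET.fromstring(xml_text)
    return [parse_article(a) for a in root.iter("PubmedArticle")]


records = parse_articles(SAMPLE)
```

For files too large to load into memory at once, ET.iterparse combined with clearing each processed element would be the streaming counterpart of parse_articles.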

Parsing details:

- Make sure to parse first and last name of the Authors

- Because an article can have multiple Keywords, please store them in an extra table and use a mapping table to map them to the articles.

- Create a mapping table between Author and PMID, as one PMID can have n Authors.

- Assign unique IDs to Authors and AffiliationInfo entries, store them in two separate tables, and map them to each other, because the same author can be linked to n AffiliationInfo entries.

- Please parse AffiliationInfo into different fields, separated by comma, like on [login to view URL]

<AffiliationInfo>

<Affiliation>Institute of Pathology, University of Heidelberg, Heidelberg, Germany ; Present address: Institute of Pathology, Elbe Kliniken, Klinikum Stade, Bremervörder Str. 111, D- 21682 Stade, Germany.</Affiliation>

</AffiliationInfo>

Becomes:

AffiliationInfoLine01: Institute of Pathology

AffiliationInfoLine02: University of Heidelberg

AffiliationInfoLine03: Heidelberg

AffiliationInfoLine04: Germany

AffiliationInfoLine05: Present address: Institute of Pathology

AffiliationInfoLine06: Elbe Kliniken

AffiliationInfoLine07: Klinikum Stade

AffiliationInfoLine08: Bremervörder Str. 111

AffiliationInfoLine09: D- 21682 Stade

AffiliationInfoLine10: Germany
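
The parsing details above can be sketched as follows, assuming Python and using the standard-library sqlite3 module as a stand-in for MySQL so the example runs without a server. The example output also breaks at the semicolon and drops the final period, so split_affiliation splits on both commas and semicolons. All table, column and function names are hypothetical. Storing the affiliation parts as numbered rows rather than as fixed AffiliationInfoLine01..10 columns is a design choice of this sketch, not something the posting asks for. Treating a (last name, first name) pair as one author is a simplification, since distinct people can share a name.

```python
import re
import sqlite3

EXAMPLE = ("Institute of Pathology, University of Heidelberg, Heidelberg, Germany ; "
           "Present address: Institute of Pathology, Elbe Kliniken, Klinikum Stade, "
           "Bremervörder Str. 111, D- 21682 Stade, Germany.")


def split_affiliation(affiliation):
    """Split on commas and semicolons; trim whitespace and the trailing period."""
    text = affiliation.strip().rstrip(".")
    return [part.strip() for part in re.split(r"[,;]", text) if part.strip()]


# SQLite syntax; MySQL would need its own column types and key lengths.
SCHEMA = """
CREATE TABLE article (pmid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE keyword (id INTEGER PRIMARY KEY, term TEXT UNIQUE);
CREATE TABLE article_keyword (pmid INTEGER, keyword_id INTEGER,
                              PRIMARY KEY (pmid, keyword_id));
CREATE TABLE author (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT,
                     UNIQUE (last_name, first_name));
CREATE TABLE article_author (pmid INTEGER, author_id INTEGER,
                             PRIMARY KEY (pmid, author_id));
CREATE TABLE affiliation (id INTEGER PRIMARY KEY, raw TEXT UNIQUE);
CREATE TABLE affiliation_line (affiliation_id INTEGER, line_no INTEGER, value TEXT,
                               PRIMARY KEY (affiliation_id, line_no));
CREATE TABLE author_affiliation (author_id INTEGER, affiliation_id INTEGER,
                                 PRIMARY KEY (author_id, affiliation_id));
"""


def get_or_create(conn, table, **cols):
    """Insert a row unless an identical one exists, then return its id.

    Table and column names are interpolated into the SQL, so they must come
    from trusted code, never from the parsed data.
    """
    names = ", ".join(cols)
    marks = ", ".join("?" for _ in cols)
    where = " AND ".join(f"{name} = ?" for name in cols)
    values = tuple(cols.values())
    # "INSERT OR IGNORE" is SQLite; the MySQL spelling is "INSERT IGNORE".
    conn.execute(f"INSERT OR IGNORE INTO {table} ({names}) VALUES ({marks})", values)
    return conn.execute(f"SELECT id FROM {table} WHERE {where}", values).fetchone()[0]


def store_article(conn, pmid, title, keywords, authors):
    """Store one article plus its keyword, author and affiliation mappings."""
    conn.execute("INSERT OR IGNORE INTO article VALUES (?, ?)", (pmid, title))
    for term in keywords:
        keyword_id = get_or_create(conn, "keyword", term=term)
        conn.execute("INSERT OR IGNORE INTO article_keyword VALUES (?, ?)",
                     (pmid, keyword_id))
    for last_name, first_name, affiliations in authors:
        author_id = get_or_create(conn, "author",
                                  last_name=last_name, first_name=first_name)
        conn.execute("INSERT OR IGNORE INTO article_author VALUES (?, ?)",
                     (pmid, author_id))
        for raw in affiliations:
            aff_id = get_or_create(conn, "affiliation", raw=raw)
            conn.execute("INSERT OR IGNORE INTO author_affiliation VALUES (?, ?)",
                         (author_id, aff_id))
            for line_no, value in enumerate(split_affiliation(raw), start=1):
                conn.execute("INSERT OR IGNORE INTO affiliation_line VALUES (?, ?, ?)",
                             (aff_id, line_no, value))


# Two hypothetical articles sharing an author, an affiliation and a keyword.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store_article(conn, 1, "First paper", ["pathology"], [("Doe", "Jane", [EXAMPLE])])
store_article(conn, 2, "Second paper", ["pathology", "biopsy"], [("Doe", "Jane", [EXAMPLE])])
```

After both calls the author, the affiliation and the shared keyword are each stored once, while the mapping tables hold one row per article link.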

Data Entry, MySQL, PHP, Python, Shell Script

Project ID: #10476684

About the project

22 proposals, remote project, active Nov 29, 2016

Awarded to:

defkrie

Bonjour, Working for 8 years as a consultant I would be happy to perform this mission. I like to innovate, build, participate in the design of complex architectures subject to strong constraints (volume, availabili…

$100 USD in 5 days
(13 reviews)
4.5

22 freelancers are bidding an average of $135 for this job

TenStar718

Hello, and thanks for the opportunity to bid on your project. https://www.freelancer.com/u/TenStar718.html I am an expert in many different areas of web and mobile applications based on the following languages: W…

$157 USD in 1 day
(255 reviews)
8.9
seaanddream

Hi, my name is Sevinc. I read your "Downloading and parsing of 27M XML datasets" project description carefully before bidding. I checked the URLs you provided, along with your required data fields as well... I got w…

$105 USD in 3 days
(322 reviews)
8.8
ZhangDa

Hi, I visited one URL to see the XML format. So there will be 4 tables in total, correct? *table_Article -> the main table. The columns are the ones you listed in the job description. *table_Keyword -> the map of article an…

$180 USD in 10 days
(59 reviews)
6.9
ersharmadinesh19

Hi, Greetings of the day!! Thanks for reviewing my bid. I have gone through your description and can do this work comfortably. Please open this PMB. Waiting for your response!! Regards, Ashi

$155 USD in 3 days
(95 reviews)
6.3
RRajeshR

Hi, I am interested in working on your project. I can extract the 27M data and provide you the result data. I am not sure about the IP block on this portal when I extract the data. I will try to use the VPN to avoid if a…

$250 USD in 10 days
(54 reviews)
6.4
sonarkaushik

Sir, I am well versed in this kind of jobs and can do your project as per requirement. I have over 8 years of experiences. I am very much able to work on this. ***I am ready to start

$188 USD in 6 days
(117 reviews)
6.4
sylar1015

hello, sir: c/c++/python expert worked for samsung & huawei maybe more details will be helpful a sample can be provided before hired. hope to get message from u ty

$155 USD in 2 days
(24 reviews)
5.3
ahavic1

Hi, I have a lot of experience in web scraping and data extraction with python. I know my way around with mysql database and I can provide you finished project within 3 days. Look forward to hearing from you.

$130 USD in 3 days
(8 reviews)
3.6
gavinsal

Hi, I'm relatively new to freelancing, but I have 19 years of experience in IT and software development, including in Python, PHP, MySQL and scripting and web scraping. What you're asking for seems relatively straig…

$200 USD in 3 days
(3 reviews)
3.7
Bakhtiyor

Dear Sir or Madam, I am writing to express my interest in this project. My name is Bakhtiyor, I am a web developer from Tajikistan with more than 10 years of experience in this field. My mother tongue is Tajik and I…

$111 USD in 0 days
(3 reviews)
2.2
jyericlin

In a team that built one of Canada's largest academic cloud computing platforms. Designed and built a monitoring big data platform using Hadoop technology and deployed on the cloud platform. Design, and develop a number…

$200 USD in 3 days
(1 review)
2.2
intrashell

Hello, I have done this a few different ways, once in Java and twice in PHP. 27 million entries shouldn't be a problem. I already have a library that handles MySQL entries of 1,000 inserts per 5ms. All three times w…

$50 USD in 2 days
(1 review)
1.1
hardiktrivedi700

Hello sir, We do all types of data entry work. And I have a staff of 15 people and more than 10 typists as a backup. Regarding accuracy and timeline I always give a guaranteed task. You can surely expect a power…

$30 USD in 3 days
(0 reviews)
0.0
divyadilip

I am looking for an opportunity to utilize my skills and abilities working with organization that offer professional growth while being resourceful, innovative and flexible

$30 USD in 1 day
(0 reviews)
0.0
ggogi016

Hi, If you are ok with the proposal, the date will be parsed as soon as the files are downloaded (as there are a lot of files this can take some time). Best regards, Goran

$30 USD in 3 days
(0 reviews)
0.0