-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
241 lines (156 loc) · 9.42 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
---
title: "wikimapR, an R package for importing Wikimapia data as Simple Features via API"
keywords: "wikimapia, API, sf, Simple Features, R"
output: github_document
# output:
# rmarkdown::html_vignette:
# self_contained: no
#
# md_document:
# variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r opts, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
warning = TRUE,
message = TRUE,
width = 120,
comment = "#>",
fig.retina = 2,
fig.path = "man/figures/README-"
)
```
[![DOI](https://zenodo.org/badge/161494897.svg)](https://zenodo.org/badge/latestdoi/161494897){target='_blank'}
`wikimapR` is an R package for accessing the raw vector data from Wikimapia via official [Wikimapia API](http://wikimapia.org/api){target='_blank'}. Map data is returned as
[Simple Features (`sf`)](https://cran.r-project.org/package=sf){target='_blank'} objects with some of the object details included as nested `lists`.
WARNING: the package may NOT work for now because of the breaking change in the Wikimapia API. More details at [https://github.com/e-kotov/wikimapR/issues/2](https://github.com/e-kotov/wikimapR/issues/2){target='_blank'}. Using 'example' API key does not seem to work, however everything seems to be working with private API keys.
**This package is at a VERY alpha stage. Provided 'as is'. Use with caution.**
You may also want to try [a similar package for Python](https://github.com/plandex/wikimapia-api-py){target='_blank'}.
### Installation
To install:
```{r install_package, eval=FALSE}
# Install the development version
# install.packages("remotes")
remotes::install_github("e-kotov/wikimapR")
```
To load the package and check the version:
```{r enable_package, eval=TRUE}
library(wikimapR)
packageVersion("wikimapR")
```
Load additional packages:
```{r message=FALSE, warning=FALSE}
library(sf)
library(magrittr)
library(dplyr)
library(purrr)
library(progress)
```
### Usage
Set your API key. There is the API key 'example' that's used by the testing page [https://wikimapia.org/api/?action=examples](https://wikimapia.org/api/?action=examples){target='_blank'}. It is limited to one request every 30 seconds. Get your own API key at [https://wikimapia.org/api/?action=my_keys](https://wikimapia.org/api/?action=my_keys){target='_blank'} to get up to 100 requests in 5 minutes.
```{r set_api_key}
# change to your own key
set_wikimapia_api_key("your_key")
# set_wikimapia_api_key("example) # use 'example' key with 1 request per 30 seconds rate limit
```
#### Choose a bbox
Have a look at [https://boundingbox.klokantech.com](https://boundingbox.klokantech.com){target='_blank'} and choose a bounding box. For example: `-1.617122,53.764541,-1.467519,53.831568` for Leeds and it's surroundings.
```{r set_bbox, eval=TRUE}
bbox <- c(-1.669629,53.739816,-1.422465,53.869239) # Leeds
```
#### Get the number of objects in this bounding box
Use [`box`](http://wikimapia.org/api#oldbox){target='_blank'} API function to get the number of features in your area of interest.
```{r estimate_n, eval=TRUE}
wm <- wm_get_from_bbox(x = bbox, get_location = FALSE, meta_only = TRUE)
wm$found
```
Now we know how many objects we have in the bounding box.
There are `page` and `count` parameters in the Wikimapia API, but you cannot request objects from a bounding box beyond 10 000 no matter how high you set the `page` parameter. So you have to subdivide the large bounding box that you have into smaller bounding boxes with a maximum of 10 000 objects each.
#### Subdivide the bounding box into smaller ones
`subdivide_bbox()` subdivides a large bounding box into smaller ones and returns `sf polygons`, or `bbox` objects or both. It is good for large areas with defaults tuned to cities like Moscow.
```{r subdivide_bbox, eval=TRUE}
small_bboxes <- subdivide_bbox(x = bbox, bbox_cell_size = 0.1, return_bbox_or_sf = "both")
plot(small_bboxes$sf$geometry)
```
In this example the bounding box cell size is in degrees (need to somehow fix that to work with meters across the globe). Default of 0.045 degrees is reasonably large, roughly equivalent to 2845x5010 meters. It has proved to fit < 10 000 objects per cell in Moscow, where the density of objects is quite high. For less object-dense cities you may be able to get away with larger grid of bounding boxes.
This is not ideal and it would be more convenient to create this subdivision using precise metric, but this will do fine for now.
#### Check then number of objects per cell
Just to be sure that every bounding box that we generated has <= 10 000 objects, let us query all the bounding boxes. For the current example with `r length(small_bboxes$bbox)` it will take about `r length(small_bboxes$bbox)/2` minutes, as with "example" API key the cool-down is about 30 seconds.
```{r estimate_subdivided_n, cache=FALSE, message=FALSE, warning=FALSE}
# get objects in every bbox, but no need to get the location for now
objects_in_bboxes <- small_bboxes$bbox %>% purrr::map(
~wm_get_from_bbox(x = .x, get_location = FALSE), .progress = T )
```
Now we look at the histogram of the number of objects in the small bounding boxes and the maximum value:
```{r b_per_bbox, eval=TRUE}
n_by_bbox <- objects_in_bboxes %>% purrr::map_int(~ .x$meta$found) # extract the number of found objects for every bounding box
max(n_by_bbox)
hist(n_by_bbox)
```
Since the maximum is well within 10 000, we can proceed to collect the objects IDs and then download attributes for individual features. We have to download the detailed objects features one-by-one as [`box`](http://wikimapia.org/api#oldbox){target='_blank'} API only returns object ID, name and URL. So the strategy for getting all object details is to make a list of object IDs using [`box`](http://wikimapia.org/api#oldbox){target='_blank'} API and then use [`place.getbyid`](http://wikimapia.org/api#placegetbyid){target='_blank'}.
#### Create a list of IDs to fetch
```{r id_list, eval=TRUE, cache=FALSE}
id_list <- objects_in_bboxes %>%
purrr::map(~ .x$df$id) %>% # pull object IDs from individual bbox query results data.frames
unlist() # bind together
str(id_list)
head(id_list)
```
Now we have a list of `r length(id_list)` object IDs that we want to get the details for.
#### Get detailed data for Wikimapia objects
Let's take just 3 first objects for this example. It will take up to 1.5 minutes to fetch them with all the details using the 'example' API.
```{r get_indiv_objects, cache=FALSE, message=FALSE, warning=FALSE}
short_list <- id_list[1:3]
wm_objects <- wm_get_by_id(ids = short_list)
```
#### We have the objects!
```{r obj_str}
str(wm_objects, max.level = 1, nchar.max = 50)
```
##### We can now plot them
```{r obj_plot}
plot(wm_objects$geometry)
```
##### We can explore the attributes
```{r obj_columns}
names(wm_objects)
```
##### Most of the details are currently in a nested list for every object
You can use [purrr](https://github.com/tidyverse/purrr){target='_blank'} and/or [rlist](https://github.com/renkun-ken/rlist){target='_blank'} packages to pull any of the details from these nested lists.
```{r obj_details}
str(wm_objects$details[[1]], max.level = 1)
```
### To-Do List
- Rewrite `subdivide_bbox()` to accept metric values for bbox dimensions (not critical, but may be useful for other projects)
- Create a few more helper functions to abstract the user from the calls to `purrr` for simple things like getting the number of found objects, or for getting number of objects per bounding box (see examples above in the Usage section)
- Create a hidden environment variable for storing API key and using it automatically in the package functions
- Implement the rest of the Wikimapia API functions
- Fix bugs if any
- Make code more robust
- Write unit tests
- Your suggestions are welcome via GitHub issues for this package
- Submit the package to CRAN someday..?
### Citation
```{r comment = "", eval=FALSE}
citation("wikimapR")
print(citation("wikimapR"), bibtex=TRUE)
```
To cite wikimapR in publications, please use:
> Kotov E. (2018). wikimapR: Import Wikimapia Data as Simple Features via API. DOI: 10.5281/zenodo.3459878. https://github.com/e-kotov/wikimapR
A BibTeX entry for LaTeX users is:
```
@Manual{,
title = {wikimapR: Import Wikimapia Data as Simple Features via API},
author = {Egor Kotov},
doi = {10.5281/zenodo.3459878},
year = {2018},
url = {https://github.com/e-kotov/wikimapR},
}
```
### License
The MIT License (MIT) + License File
Copyright © 2018 Egor Kotov
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.