# The Complete Guide to Data Visualization in R and Python

## Match the chart to your data

`What kind of data do you have? Pick the main type using the buttons below. Then let the decision tree guide you toward your graphic possibilities.`

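Yan Holtz and Conor Healy are good friends who built this website together in their spare time. It comes with source code in R and Python, so we get a large collection of excellent examples as well as a decision tree that tells us which code to use for which kind of data. The two of them not only provide the tools, they also teach how to use them. Since all of this is shared freely, please acknowledge them in your article if you find it useful. For questions about the code, contact Yan Holtz, who is mainly responsible for the code; Conor Healy is responsible for the graphic design.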

### Plotting an ordered two-variable data frame

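```r
# Libraries
library(tidyverse)
library(hrbrthemes)  # provides theme_ipsum()
library(patchwork)   # lets us combine plots with p1 + p2

# Load dataset from GitHub (see ?as.Date for date parsing options)
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv", header = TRUE)
data$date <- as.Date(data$date)
```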
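The plots that follow take the most recent rows with `tail()`, which only returns the latest observations if the rows are sorted by date. Here is a minimal sketch of that check, using a small hypothetical data frame (`toy` and its values are made up for illustration and are not taken from the Bitcoin file):

```r
# Hypothetical toy data, deliberately out of order
toy <- data.frame(
  date  = c("2018-04-23", "2018-04-21", "2018-04-22"),
  value = c(3, 1, 2)
)

# Convert the text column to Date so comparisons are chronological
toy$date <- as.Date(toy$date)

# TRUE when at least one date is earlier than the one before it
needs_sorting <- any(diff(as.numeric(toy$date)) < 0)

# Sort by date so that tail() returns the latest observations
toy_sorted <- toy[order(toy$date), ]
latest <- tail(toy_sorted, 2)
```

If `needs_sorting` comes out `FALSE` for your own data, the sorting step can be skipped.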

### Connected scatterplot of the last ten observations only

```r
# Plot: line plus points for the last 10 observations
data %>%
  tail(10) %>%
  ggplot(aes(x = date, y = value)) +
  geom_line(color = "#69b3a2") +
  geom_point(color = "#69b3a2", size = 4) +
  ggtitle("Evolution of Bitcoin price") +
  ylab("bitcoin price ($)") +
  theme_ipsum()
```

### Visualizing the last 60 observations

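```r
# Plot 1: line chart of the last 60 observations
p1 <- data %>%
  tail(60) %>%
  ggplot(aes(x = date, y = value)) +
  geom_line(color = "#69b3a2") +
  ggtitle("Line chart") +
  ylab("bitcoin price ($)") +
  theme_ipsum()

# Plot 2: same data with a point at every observation
p2 <- data %>%
  tail(60) %>%
  ggplot(aes(x = date, y = value)) +
  geom_line(color = "#69b3a2") +
  geom_point(color = "#69b3a2", size = 2) +
  ggtitle("Connected scatterplot") +
  ylab("bitcoin price ($)") +
  theme_ipsum()

# Show both side by side (the + operator on plots comes from the patchwork package)
p <- p1 + p2
p
```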

### Showing a time series as a scatterplot

```r
# Plot: points only, no connecting line
data %>%
  tail(60) %>%
  ggplot(aes(x = date, y = value)) +
  geom_point(color = "#69b3a2", size = 2) +
  ggtitle("Scatterplot") +
  ylab("bitcoin price ($)") +
  theme_ipsum()
```

### Visualizing grouped time series

```r
library(babynames)
library(viridis)  # provides scale_color_viridis()

# Load dataset: yearly counts for two female names
data <- babynames %>%
  filter(name %in% c("Ashley", "Amanda")) %>%
  filter(sex == "F")

# Plot: one line per name
data %>%
  ggplot(aes(x = year, y = n, group = name, color = name)) +
  geom_line() +
  scale_color_viridis(discrete = TRUE, name = "") +
  theme(legend.position = "none") +
  ggtitle("Popularity of American names in the previous 30 years") +
  theme_ipsum()
```

### Highlighting the trend with geom_segment

```r
library(grid)    # arrow() and unit()
library(ggrepel) # geom_text_repel()

# Data: one row per year, one column per name
tmp <- data %>%
  filter(year > 1970) %>%
  select(year, name, n) %>%
  pivot_wider(names_from = name, values_from = n)

# Label only a random 30% of the years to limit clutter
tmp_date <- tmp %>% sample_frac(0.3)

# Each arrow runs from one year's point to the next year's point
tmp %>%
  ggplot(aes(x = Amanda, y = Ashley, label = year)) +
  geom_point(color = "#69b3a2") +
  geom_text_repel(data = tmp_date) +
  geom_segment(
    color = "#69b3a2",
    aes(
      xend = c(tail(Amanda, n = -1), NA),
      yend = c(tail(Ashley, n = -1), NA)
    ),
    arrow = arrow(length = unit(0.3, "cm"))
  ) +
  theme_ipsum()
```
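The `xend` and `yend` aesthetics in the plot above rely on a small shifting trick: `tail(x, n = -1)` drops the first element, and appending `NA` keeps the length unchanged, so each row is paired with the coordinates of the next row. A minimal base R sketch with made-up numbers (the vectors `amanda` and `ashley` here are hypothetical and are not the babynames counts):

```r
# Hypothetical counts for four consecutive years
amanda <- c(10, 12, 15, 11)
ashley <- c(20, 18, 25, 30)

# Drop the first value and pad with NA:
# element i now holds the value from row i + 1
amanda_next <- c(tail(amanda, n = -1), NA)
ashley_next <- c(tail(ashley, n = -1), NA)

# Each row describes one arrow: from (x, y) to (xend, yend)
segments <- data.frame(
  x = amanda, y = ashley,
  xend = amanda_next, yend = ashley_next
)
print(segments)
```

The last row has no following year, so its end point is `NA`; ggplot2 typically drops that segment with a warning about missing values.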

### Cutting the y-axis or not

```r
# Reload the Bitcoin data (data was overwritten by the babynames example)
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv", header = TRUE)
data$date <- as.Date(data$date)

# Plot 1: y-axis starts at zero
p1 <- data %>%
  tail(10) %>%
  ggplot(aes(x = date, y = value)) +
  geom_line(color = "#69b3a2") +
  geom_point(color = "#69b3a2", size = 4) +
  ggtitle("Not cutting") +
  ylab("bitcoin price ($)") +
  theme_ipsum() +
  ylim(0, 10000)

# Plot 2: default y-axis, cut to the range of the data
p2 <- data %>%
  tail(10) %>%
  ggplot(aes(x = date, y = value)) +
  geom_line(color = "#69b3a2") +
  geom_point(color = "#69b3a2", size = 4) +
  ggtitle("Cutting") +
  ylab("bitcoin price ($)") +
  theme_ipsum()

# Side by side (requires the patchwork package)
p1 + p2
```
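The two panels differ only in their y-axis limits, and that alone changes how large the same price movement looks. As a rough illustration, the hypothetical helper `visual_share()` below computes the fraction of the axis height that a series spans; the prices are invented for the example:

```r
# Hypothetical helper: fraction of the axis height covered by the data
visual_share <- function(values, limits) {
  diff(range(values)) / diff(limits)
}

prices <- c(6500, 6800, 7000, 7500)  # made-up prices

# Axis starting at zero, as with ylim(0, 10000)
share_full <- visual_share(prices, c(0, 10000))

# Axis cut to the data range
# (ggplot2's default also adds a small padding, ignored here)
share_cut <- visual_share(prices, range(prices))
```

With these numbers the same 1,000-dollar swing fills 10% of the zero-based axis but the full height of the cut axis, which is why the cut version looks more dramatic.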
