r - Getting Started with R - R Plotting Basics


1. Plotting basics with plot

x <- c(1,2,3,4)
y <- c(5,1,2,9)
plot(x,y)

plot(x)

  • Plotting a factor with plot
y <- factor(c("a","b","c","b","a","a","a","C","c"))
plot(y)

a <- sample(c("a","b","c"), 10, replace=T)
a
##  [1] "b" "a" "b" "b" "c" "c" "a" "b" "a" "c"
f <- factor(a)
f
##  [1] b a b b c c a b a c
## Levels: a b c
plot(f)
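
For a factor, plot draws a bar chart of how often each level occurs. As a minimal sketch, the same picture can be built by hand with table and barplot (the vector grades is made up for illustration):

```r
# Hypothetical data, fixed so the counts are reproducible
grades <- factor(c("a", "b", "c", "b", "a", "a"))

# table() counts how many times each level occurs
counts <- table(grades)
print(counts)

# barplot() on those counts gives the same bars as plot(grades)
barplot(counts)
```

Working with table directly is handy when you want to reorder or filter the levels before plotting.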

2. boxplot

An example:

par(mfrow=c(1,3))
a <- sample(1:100, 100, replace = T)
a
##   [1]  6 63 53 71 14  2 69 75 35 33 40 78 77 94  6 37 32 87 31 40 76 52 51 91 22
##  [26] 58 76 84 59 18 92 40 19  4 58 65 93 55  4 92 51  3 22 85 69 41 59 26 88 63
##  [51] 87 47 49 11 30 71  4 86 64 11 94  6 50 31 71 73 83 88 63 57 19 81 55 40 69
##  [76] 93 18 19 54 48 71 93 16 28 15  4 29 14 90 24 42 42 40 91 63 17 48 76 65  1
boxplot(a)

If you are not sure how par works, open its help page:

?par

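The help text then appears in the lower-right pane of RStudio.

par(mfrow=c(1,3)) splits the plotting area into 1 row and 3 columns, so three plots fit side by side.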

Add a few outliers:

b <- c(a, -180:-190, 181:180)
boxplot(b)

The plot now shows outliers on both sides: points above the upper whisker and points below the lower whisker.
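
To see which values a boxplot treats as outliers, boxplot.stats can be called directly. The sketch below uses a small made-up vector v rather than the random sample above, so the result is reproducible:

```r
# Hypothetical data: the values 1..20 plus two extreme values
v <- c(1:20, -50, 60)

# boxplot.stats() applies the rule boxplot() uses by default:
# points more than 1.5 * IQR beyond the hinges are reported as outliers
s <- boxplot.stats(v)
print(s$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(s$out)    # the values drawn as separate outlier points

boxplot(v)
```

The whiskers stop at the most extreme data points that are still inside the 1.5 * IQR fence, which is why they do not reach the outliers.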

Next, use InsectSprays, a dataset that ships with R:

class(InsectSprays)
## [1] "data.frame"
summary(InsectSprays)
##      count       spray 
##  Min.   : 0.00   A:12  
##  1st Qu.: 3.00   B:12  
##  Median : 7.00   C:12  
##  Mean   : 9.50   D:12  
##  3rd Qu.:14.25   E:12  
##  Max.   :26.00   F:12
InsectSprays
##    count spray
## 1     10     A
## 2      7     A
## 3     20     A
## 4     14     A
## 5     14     A
## 6     12     A
## 7     10     A
## 8     23     A
## 9     17     A
## 10    20     A
## 11    14     A
## 12    13     A
## 13    11     B
## 14    17     B
## 15    21     B
## 16    11     B
## 17    16     B
## 18    14     B
## 19    17     B
## 20    17     B
## 21    19     B
## 22    21     B
## 23     7     B
## 24    13     B
## 25     0     C
## 26     1     C
## 27     7     C
## 28     2     C
## 29     3     C
## 30     1     C
## 31     2     C
## 32     1     C
## 33     3     C
## 34     0     C
## 35     1     C
## 36     4     C
## 37     3     D
## 38     5     D
## 39    12     D
## 40     6     D
## 41     4     D
## 42     3     D
## 43     5     D
## 44     5     D
## 45     5     D
## 46     5     D
## 47     2     D
## 48     4     D
## 49     3     E
## 50     5     E
## 51     3     E
## 52     5     E
## 53     3     E
## 54     6     E
## 55     1     E
## 56     1     E
## 57     3     E
## 58     2     E
## 59     6     E
## 60     4     E
## 61    11     F
## 62     9     F
## 63    15     F
## 64    22     F
## 65    15     F
## 66    16     F
## 67    13     F
## 68    10     F
## 69    26     F
## 70    26     F
## 71    24     F
## 72    13     F

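Note that ~ does not express causality; it builds a formula that reads "count grouped by spray". The left-hand side, count, is the response and goes on the vertical axis; the right-hand side, spray, is the grouping factor and goes on the horizontal axis, with one box per spray.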

boxplot(count~spray, data=InsectSprays)
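
One way to check which side of the formula ends up where is to look at what boxplot returns. This is a small sketch, not part of the original walkthrough; plot = FALSE suppresses the drawing:

```r
# plot = FALSE returns the statistics without drawing anything
bp <- boxplot(count ~ spray, data = InsectSprays, plot = FALSE)

# One box per level of spray (the right-hand side of the formula)
print(bp$names)

# Row 3 of bp$stats holds the median of count (the left-hand side) per group
print(bp$stats[3, ])

# The same medians computed directly
med <- tapply(InsectSprays$count, InsectSprays$spray, median)
print(med)
```

The groups come from spray and the numbers come from count, which matches the layout of the plot.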

Add color:

boxplot(count~spray, data=InsectSprays, col=2:7)

A more advanced boxplot:

boxplot(count~spray, data = InsectSprays, col = "lightgray")

boxplot(count~spray, data = InsectSprays, col = "lightgray")
boxplot(count~spray, data=InsectSprays,notch=T, add=T, col=2:7)
## Warning in bxp(list(stats = structure(c(7, 11, 14, 18.5, 23, 7, 12, 16.5, : some
## notches went outside hinges ('box'): maybe set notch=FALSE

  • notch: draws a notch in each side of the box at the median
  • add: draws the boxes on top of the existing plot instead of starting a new one

3. plot can draw box plots too

If the first argument to plot is a factor and the second is numeric, plot draws a box plot.

par(mfrow=c(1,2))
y <- c(10,506,140,200)
x <- c(1,2,1,2)
plot(x,y)


a <- y
b <- x
b <- as.factor(b)
plot(b,a)

Next, a box plot of scores grouped by class (班级 in the labels means "class"):

y <- c(88,99,66,77,88,97,33,55,66,99,88,99,77,55,66,77,98,99,96,90,80)
y
##  [1] 88 99 66 77 88 97 33 55 66 99 88 99 77 55 66 77 98 99 96 90 80
f <- factor(c(rep("班级1",6), rep("班级2",3), rep("班级3",5), rep("班级4",7)))
f
##  [1] 班级1 班级1 班级1 班级1 班级1 班级1 班级2 班级2 班级2 班级3 班级3 班级3
## [13] 班级3 班级3 班级4 班级4 班级4 班级4 班级4 班级4 班级4
## Levels: 班级1 班级2 班级3 班级4
plot(f,y)

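4. Plotting a data frame with plot

  • plot(df): df is a data frame
  • plot(~expr): expr is an expression built from variable names, such as a+b
  • plot(y~expr): y is the variable plotted against each term of expr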
df <- data.frame(
  age=c(10,12,13),
  height=c(150,160,170),
  weight=c(50,60,70)
)
df
##   age height weight
## 1  10    150     50
## 2  12    160     60
## 3  13    170     70
plot(df)

plot(~age+height,data=df)

plot(weight~age+height, data=df)

5. Plotting a matrix or data frame with pairs

x <- matrix(1:9, nrow = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
pairs(x)

x <- matrix(1:10, nrow = 2)
x
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
pairs(x)

So each column is treated as one variable (var).

x <- matrix(1:10, nrow = 5)
x
##      [,1] [,2]
## [1,]    1    6
## [2,]    2    7
## [3,]    3    8
## [4,]    4    9
## [5,]    5   10
pairs(x)

Here there are two variables.

6. Conditioning plots with coplot

a <- c("a","b","a","b","a","b")
n1 <- c(1,2,3,4,5,6)
n2 <- c(100,200,300,400,500,600)
df <- data.frame(a,n1,n2)
df
##   a n1  n2
## 1 a  1 100
## 2 b  2 200
## 3 a  3 300
## 4 b  4 400
## 5 a  5 500
## 6 b  6 600
coplot(n1~n2|a)

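This puts n1 on the vertical axis (the response). The left panel uses the n2 values of the rows where a is "a" as the horizontal axis (the predictor); the right panel uses the n2 values of the rows where a is "b".

So, given one variable (here the factor a), coplot draws the relationship between two other variables (here n1 and n2) separately for each of its values.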
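
To make the panel split concrete, the rows can be grouped by hand with split. This is only an illustrative sketch; it stores a as a factor inside the data frame and passes data = df, which the original call does not do:

```r
# Same values as above, with a stored as a factor
df <- data.frame(
  a  = factor(c("a", "b", "a", "b", "a", "b")),
  n1 = c(1, 2, 3, 4, 5, 6),
  n2 = c(100, 200, 300, 400, 500, 600)
)

# split() groups the rows by the conditioning variable a;
# each group corresponds to one coplot panel
panels <- split(df[c("n1", "n2")], df$a)
print(panels$a)  # rows shown in the panel for a == "a"
print(panels$b)  # rows shown in the panel for a == "b"

# data = df makes coplot look the variables up in the data frame
coplot(n1 ~ n2 | a, data = df)
```

Passing data = df avoids depending on variables named a, n1 and n2 existing in the global environment.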

7. Histograms with hist

data <- c(rep(1,20),rep(2,11),rep(3,6))
hist(data,breaks = c(0.5,1.5,2.5,3.5))

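breaks sets the break points, that is, where each bar starts and ends. Next come freq, which switches the vertical axis between frequency (counts) and density, and main, which sets the title. Density is the relative frequency divided by the bin width; with a bin width of 1, as here, the two coincide.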

par(mfrow=c(1,2))
hist(data,breaks = c(0.5,1.5,2.5,3.5), freq = T, main="freq=T")
hist(data,breaks = c(0.5,1.5,2.5,3.5), freq = F, main="freq=F")
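
The numbers behind the two panels can be read from the object that hist returns. A short sketch using the same data (plot = FALSE skips the drawing):

```r
data <- c(rep(1, 20), rep(2, 11), rep(3, 6))
breaks <- c(0.5, 1.5, 2.5, 3.5)

# plot = FALSE returns the histogram object without drawing it
h <- hist(data, breaks = breaks, plot = FALSE)

print(h$counts)   # bar heights when freq = TRUE
print(h$density)  # bar heights when freq = FALSE

# Density is count / (total count * bin width), so the bar areas sum to 1
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```

With freq = FALSE it is the bar areas, not the bar heights, that add up to 1.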

That looks dull, so add some color:

par(mfrow=c(1,2))
hist(data,breaks = c(0.5,1.5,2.5,3.5), freq = T, main="freq=T",col = "red")
hist(data,breaks = c(0.5,1.5,2.5,3.5), freq = F, main="freq=F",col = rainbow(3))

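8. Dot charts with dotchart

The horizontal axis shows the values; the vertical axis shows the labels.

This example uses VADeaths, a dataset that ships with R and holds death rates per 1000 people in Virginia in 1940: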

VADeaths
##       Rural Male Rural Female Urban Male Urban Female
## 50-54       11.7          8.7       15.4          8.4
## 55-59       18.1         11.7       24.3         13.6
## 60-64       26.9         20.3       37.0         19.3
## 65-69       41.0         30.9       54.6         35.1
## 70-74       66.0         54.3       71.1         50.0
dotchart(VADeaths)

9. axes: controlling the axes and frame

  • axes=FALSE: the plot is drawn without axes or the surrounding box
  • axes defaults to TRUE
x <- 1:100
y <- rnorm(100)
par(mfrow=c(1,2))
plot(x,y)
plot(x,y,axes=F)

10. The log parameter: logarithmic axes

  • log = "x": use a logarithmic scale on the x-axis
  • log = "y": use a logarithmic scale on the y-axis
  • log = "xy": use a logarithmic scale on both axes
x <- c(100:400)
y <- c(100:400)
par(mfrow=c(2,2))
plot(x,y,main = "no log")
plot(x,y,log="x", main = "logX")
plot(x,y,log="y", main = "logY")
plot(x,y,log="xy",main = "logX, logY")

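Note that the tick labels still show the original values, not their logarithms; only the scale changes. Compare the spacing of the tick marks in the first plot with the other three.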
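
A quick way to confirm that only the scale changes is to ask par about the current plot. A minimal sketch:

```r
x <- 100:400
y <- 100:400

plot(x, y, log = "y", main = "logY")

# par("ylog") and par("xlog") report whether each axis of the current plot is logarithmic
print(par("ylog"))
print(par("xlog"))

# par("usr") gives the plot-region limits; on a log axis they are stored
# as log10 values, so 10^usr recovers the limits in the original units
usr <- par("usr")
print(10^usr[3:4])
```

The data passed to plot are untouched; the transformation happens only when positions on the axis are computed.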

Another example:

y <- rnorm(100,mean = 1,sd = 0.2)
y
##   [1] 0.8972154 0.9868828 1.4060411 1.0122385 0.5384932 0.9977303 0.9592429
##   [8] 1.0620681 1.1337433 0.8997237 1.0115320 0.9732185 0.9456617 0.5781495
##  [15] 1.2865029 1.2278847 1.1320299 0.8819304 1.0675654 0.7752420 0.9076857
##  [22] 1.2615293 1.0013995 1.2536666 0.8251942 0.7366765 0.8645834 1.1130853
##  [29] 0.8471434 0.7640037 0.8873307 1.0599610 0.9878274 0.6712630 0.9960379
##  [36] 0.9355953 0.8708642 1.1561958 0.9808229 0.6125047 0.8659609 0.9515395
##  [43] 1.1022031 0.9996702 1.1638596 0.9594048 0.8944815 0.8065199 0.8821586
##  [50] 1.0150457 1.1861218 1.0253477 0.8010522 0.8749333 0.5589216 0.9574013
##  [57] 1.1437481 0.6841609 0.9451278 1.0226522 1.4868991 0.9302728 0.9122796
##  [64] 1.0822525 1.1018062 1.3370627 0.8979796 1.1137944 0.8568478 0.9210010
##  [71] 0.9598558 0.7420585 1.0333764 1.0539939 1.1529271 1.0216957 0.9596904
##  [78] 1.1113689 0.6952488 1.0480901 0.9988140 0.6522419 0.8093343 0.9509023
##  [85] 1.2821122 0.8465161 0.8133667 1.1961651 0.9934969 1.3025319 0.9788967
##  [92] 0.8733015 1.0560456 0.7743228 1.0444320 0.9439117 0.8520874 1.2977290
##  [99] 0.7828039 1.2057976
x <- 1:100
x
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
##  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
##  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
##  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
##  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
##  [91]  91  92  93  94  95  96  97  98  99 100
par(mfrow=c(2,2))
plot(x,y,main="no log")
plot(x,y,log="x", main = "logX")
plot(x,y,log="y", main = "logY")
plot(x,y,log = "xy",main = "logX, logY")

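The no-log plot shows that the data lie mostly between 0.6 and 1.4. A log scale on x stretches the small x values and squeezes the large ones together, so in the logX plot most points are packed into a narrow band, which makes it easier to see that they cluster around 1 than in the no-log plot. The logX, logY plot shows the same thing, while the logY plot alone does not make it as obvious.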

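11. The type parameter: how the data are drawn

Values for type:

  • p: points (the default)
  • l: lines
  • b: both points and connecting lines
  • o: points overplotted on the lines
  • h: vertical lines from each point down to the x-axis
  • s: stair steps
  • n: nothing is drawn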
x <- 1:20
y <- sample(1:100, 20)
par(mfrow=c(2,2))
plot(x,y,main = "p",type = "p")
plot(x,y,main = "l",type = "l")
plot(x,y,main = "b",type = "b")
plot(x,y,main = "o",type = "o")

par(mfrow=c(2,2))
plot(x,y,main = "h",type = "h")
plot(x,y,main = "s",type = "s")
plot(x,y,main = "n",type = "n")
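
type = "n" looks useless on its own, but it is a convenient way to set up an empty canvas and then add layers. A sketch with a fixed seed (the seed and the threshold of 50 are arbitrary choices for illustration):

```r
x <- 1:20
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
y <- sample(1:100, 20)

# type = "n" sets up axes and limits from the data but draws nothing
plot(x, y, type = "n", main = "built in layers")

# Add the layers one at a time
lines(x, y, col = "gray")
points(x[y > 50], y[y > 50], col = "red")
points(x[y <= 50], y[y <= 50], col = "blue")
```

Because the empty plot already has the right limits, every layer added afterwards lands in the correct place.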

12. Annotating a plot: xlab, ylab, main, sub

  • xlab: label for the x-axis
  • ylab: label for the y-axis
  • main: main title of the plot
  • sub: subtitle, drawn below the x-axis label
x <- 1:20
y <- sample(1:1000, 20)
par(mfrow=c(1,2))
plot(x,y)
plot(x,y,xlab = "this is X label", ylab = "this is Y label", main = "this is main/title", sub = "this is sub title of the sub plot")

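13. points: adding points

  • adds points to an existing plot
  • takes the same arguments as plot(x,y), but draws onto the current plot instead of starting a new one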
x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
par(mfrow=c(2,1))
plot(x,y,type="l")
plot(x,z,type = "l")

The code above draws two separate plots.

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l")
points(x,z,type = "l",col="red")

Some values of z fall outside the plotted range, though. What to do? Widen the y-axis with ylim:

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
points(x,z,type = "l",col="red")
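
Instead of hard-coding c(0,300), the limits can be derived from the data with range. A small sketch (the seed is an arbitrary choice to keep it reproducible):

```r
x <- 1:10
set.seed(42)  # assumption: fixed seed for reproducibility
y <- sample(1:200, 10)
z <- sample(150:300, 10)

# range() over both series gives limits that contain every point
ylim <- range(y, z)
plot(x, y, type = "l", ylim = ylim)
points(x, z, type = "l", col = "red")
```

This keeps working if the data change, whereas fixed limits have to be updated by hand.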

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
points(x,z,type = "p",col="red")

You can also change type to turn the red line into points or another style, as the last example does.

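14. lines: adding lines

  • adds lines to an existing plot
  • works like plot(x,y,type="l"), but draws onto the current plot instead of starting a new one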
x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
lines(x,z,type = "l",col="red")

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
lines(x,z,type = "b",col="red")

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
lines(x,z,type = "h",col="red")

x <- c(1:10)
y <- sample(1:200, 10)
z <- sample(150:300, 10)
plot(x,y,type="l",ylim=c(0,300))
lines(x,z,type = "s",col="red")

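15. The par function

  • par(mfrow=c(m,n)): splits the plotting area into m rows and n columns, giving room for m*n plots
  • par(new=TRUE): the next plot is drawn on top of the current one instead of replacing it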
x <- 1:20
y1 <- sample(1:1000,20)
y2 <- sample(1:1000,20)
plot(x,y1,type = "l")
par(new=TRUE)
plot(x,y2,col="red",type = 'o')

x <- 1:20
y1 <- sample(1:1000,20)
y2 <- sample(1:1000,20)
plot(x,y1,type = "l")
par(new=TRUE)
plot(x,y1,col="red",type = 'p')

This gives a black line combined with red points!
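
One thing to watch with par(new=TRUE): the second plot call draws its own axes and labels over the first, which gets messy when the two series have different ranges. A possible workaround, sketched with made-up series y1 and y2, is to suppress the second set of axes and put its scale on the right:

```r
x  <- 1:20
y1 <- seq(10, 200, length.out = 20)    # hypothetical series on a small scale
y2 <- seq(5000, 100, length.out = 20)  # hypothetical series on a large scale

plot(x, y1, type = "l", ylab = "y1")

par(new = TRUE)
# Suppress the second set of axes and labels so they do not overprint the first
plot(x, y2, type = "o", col = "red", axes = FALSE, xlab = "", ylab = "")

# Draw the scale for y2 on the right-hand side instead
axis(4, col.axis = "red")
```

When both series share the same scale, lines or points on a single plot is usually simpler than par(new=TRUE).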

16. Combining par with lwd

  • lwd sets the line width
  • combine it with par(new=TRUE)
x <- 1:20
y1 <- sample(1:1000,20)
plot(x,y1,type = "l",col="red",lwd=10)

plot(x,y1,type="l",col="yellow",lwd=5)

Put the two together and something neat happens!

plot(x,y1,type = "l",col="red",lwd=10)
par(new=TRUE)
plot(x,y1,type="l",col="yellow",lwd=5)

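Beautiful!

17. text: adding labels

Details of the plot below:

  • type="n": do not draw the points!
  • text(x,y): writes a label at each (x,y); by default the label is the index of the point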
x <- 1:20
y <- sample(1:10000000000, 20)
plot(x,y,type = "n")
text(x,y)

x <- c(1:5)
y <- c(6:10)
plot(x,y,type = "b")
text(x,y)

One problem is visible: the labels overlap the points and are hard to read.

x <- c(1:5)
y <- c(6:10)
plot(x,y,type = "b")
text(x+0.1,y-0.05)

Much better!

x <- c(1:5)
y <- c(6:10)
plot(x,y,type = "b")
text(x+0.1,y-0.05, labels = c("A","B","C","D"))

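The detail to notice here is that labels sets custom label text. There are five points but only four labels, so R recycles the vector for the last point, that is, it repeats it and starts again from A.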
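
The recycling can be made visible with rep_len, and the manual offsets can be replaced by the pos argument of text. A sketch under those assumptions (the wider xlim is my addition so the last label has room):

```r
x <- 1:5
y <- 6:10
labs <- c("A", "B", "C", "D")

# rep_len() shows what 4 labels recycled over 5 points look like
recycled <- rep_len(labs, length(x))
print(recycled)

plot(x, y, type = "b", xlim = c(1, 5.5))
# pos = 4 places each label to the right of its point, so no manual offset is needed
text(x, y, labels = LETTERS[seq_along(x)], pos = 4)
```

Other values of pos are 1 (below), 2 (left) and 3 (above).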

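18. abline: drawing reference lines

  • abline(a,b): draws the straight line y=bx+a
  • abline(h=y): draws a horizontal line at height y across the whole plot
  • abline(v=x): draws a vertical line at position x across the whole plot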
plot(1:5,1:5)
abline(h=4,col="blue",lty=3)
abline(h=2)
abline(v=1,col="red",lty=2)
abline(v=4.5,lty=4,col=9)
abline(-3,3, lty=1) # y = 3x-3

See ?abline for the details.
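
abline also accepts a fitted model, which is a common way to add a regression line. A minimal sketch with made-up points that lie exactly on y = 3x - 3:

```r
# Hypothetical points lying exactly on y = 3x - 3
x <- 1:5
y <- 3 * x - 3

fit <- lm(y ~ x)
print(coef(fit))  # intercept a and slope b

plot(x, y)
# abline() reads a and b from the coefficients of the fitted model
abline(fit, col = "blue", lty = 2)
```

This saves copying the intercept and slope out of the model by hand.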

19. polygon: adding a polygon to a plot

x <- 1:10
y <- rnorm(x)
plot(x,y,type="l")

x <- 1:10
y <- rnorm(x)
x1 <- c(2,4,4,2)
y1 <- c(0,0,1,1)
plot(x,y,type="l")
polygon(x1,y1,col = 'pink')

x <- 1:10
y <- rnorm(x)
x1 <- c(2,4,4,2)
y1 <- c(0,0,1,1)
plot(x,y,type="l")
polygon(x1,y1,col = 'yellow', border = 5,lty = 10)

Likewise, see ?polygon to learn more.
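
A typical use of polygon is shading the area under a curve. The sketch below uses sin(x) on [0, pi] purely as an example curve:

```r
x <- seq(0, pi, length.out = 50)
y <- sin(x)

plot(x, y, type = "l")

# Trace the curve left to right, then return along y = 0 right to left,
# so the vertices enclose the area between the curve and the x-axis
px <- c(x, rev(x))
py <- c(y, rep(0, length(x)))
polygon(px, py, col = "pink", border = NA)

# Redraw the curve so it stays visible on top of the fill
lines(x, y)
```

The order of the vertices matters: polygon connects them in sequence and closes the shape back to the first one.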

20. title: adding a title to a plot

x <- 1:10
y <- rnorm(x)
x1 <- c(2,4,4,2)
y1 <- c(0,0,1,1)
plot(x,y,type="l")
polygon(x1,y1,col = 'pink')
title("this is a title")

So what is the difference between title and main/sub? Take the same plot and draw it twice:

x <- 1:10
y <- rnorm(x)
x1 <- c(2,4,4,2)
y1 <- c(0,0,1,1)

par(mfrow=c(1,2))
plot(x,y,type="l",main = "this is main 1",sub = "this is sub1")
polygon(x1,y1,col = 'pink')
plot(x,y,type="l", main = "this is main 2", sub = 'this is sub2')
polygon(x1,y1,col = 'black')

par(new=TRUE,mfrow=c(1,1))
title("this is a title")

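Does that make it clearer? The code first draws a figure with two panels, each with its own main and sub. Calling title directly at that point does not work as intended: by default the title lands on top of "this is main 2" (try it!). To put a title in the middle, the idea is to open a new plot region that overlaps the whole figure, which is what par(new=TRUE,mfrow=c(1,1)) does.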

21. axis: controlling the axes

axis(side,...)

  • 1: bottom
  • 2: left
  • 3: top
  • 4: right

axis is usually used together with axes=F.

x <- 1:10
y <- rnorm(x)
par(mfrow=c(2,2))
plot(x,y,type="l",axes = F); axis(1)
plot(x,y,type="l",axes = F); axis(2)
plot(x,y,type="l",axes = F); axis(3)
plot(x,y,type="l",axes = F); axis(4)
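
Beyond choosing a side, axis can place ticks at chosen positions with custom text. A sketch, where the tick positions and the labels "start", "middle" and "end" are invented for illustration:

```r
x <- 1:10
set.seed(7)  # assumption: fixed seed for reproducibility
y <- rnorm(10)

plot(x, y, type = "l", axes = FALSE)

# at = tick positions, labels = text drawn at those positions
axis(1, at = c(1, 5, 10), labels = c("start", "middle", "end"))
axis(2)

# box() restores the frame that axes = FALSE removed
box()
```

at and labels must have the same length, one label per tick.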

22. xlim and ylim: the range of the axes

x <- c(1:10)
y <- sample(1:100,10)
par(mfrow=c(2,2))
plot(x,y)
plot(x,y,xlim=c(4,8))
plot(x,y,ylim=c(0,50))
plot(x,y,xlim = c(1,5),ylim = c(50,80))