File:NZ opinion polls 2014-2017-minorparties.png
Page contents not supported in other languages.
Appearance
NZ_opinion_polls_2014-2017-minorparties.png (778 × 487 pixels, file size: 16 KB, MIME type: image/png)
This is a file from the Wikimedia Commons. Information from its description page there is shown below. Commons is a freely licensed media file repository. You can help. |
Summary
DescriptionNZ opinion polls 2014-2017-minorparties.png |
English: Graph showing support for political parties in New Zealand since the 2014 election, according to various political polls. Data is obtained from the Wikipedia page, Opinion polling for the Next New Zealand general election |
Date | |
Source | Own work based on very very lightly modified R Code from File:NZ_opinion_polls_2011-2014-majorparties.png |
Author | Limegreen |
This file may be updated to reflect new information. If you wish to use a specific version of the file without it being overwritten, please upload the required version as a separate file. |
Figure is produced using the R statistical package, using the following code. It first reads the HTML directly from the website, then parses the data and saves the graph into your working directory. It should be able to be run directly by anyone with R.
rm(list=ls())
require(mgcv)
require(tidyverse)
#==========================================
#Parameters - specified as a list
opts <- list()
opts$major <- list(parties= c("Green","Labour","National","NZ First"), #use precise names from Table headers
ylims = c(0,65), #Vertical range
fname= "NZ_opinion_polls_2014-2017-majorparties.png",
dp=0) #Number of decimal places to round estimates to
opts$minor <- list(parties=c("ACT","Maori","United","Mana","Con", "TOP" #please use "Maori" for the Maori party
),
ylims = c(0,6), #Vertical range
fname = "NZ_opinion_polls_2014-2017-minorparties.png",
dp=1) #Number of decimal places to round estimates to
#==========================================
#Shouldn't need to edit anything below here
#==========================================
#Load the complete HTML file into memory
html <- readLines(url("https://en.wikipedia.org/wiki/Opinion_polling_for_the_New_Zealand_general_election,_2017",encoding="UTF-8"))
# html <- read_html("http://en.wikipedia.org/wiki/Opinion_polling_for_the_next_New_Zealand_general_election",encoding="UTF-8")
closeAllConnections()
#Extract the opinion poll data table
tbl.no <- 1
tbl <- html[(grep("<table.*",html)[tbl.no]):(grep("</table.*",html)[tbl.no])]
#Now split it into the rows, based on the <tr> tag
tbl.rows <- list()
open.tr <- grep("<tr",tbl)
close.tr <- grep("</tr",tbl)
for(i in 1:length(open.tr)) tbl.rows[[i]] <- tbl[open.tr[i]:close.tr[i]]
#Extract table headers
hdrs <- grep("<th",tbl,value=TRUE)
hdrs <- hdrs[1:(length(hdrs)/2 -10)]
party.names <- gsub("<.*?>","",hdrs)[-c(1:2)] %>% #nasty hack
gsub(" ","_",.) %>% #Replace space with a _
gsub("M.{1}ori","Maori",.) #Apologies, but the hard "a" is too hard to handle otherwise
# party.cols <- gsub("^.*bgcolor=\"(.*?)\".*$","\\1",hdrs)[-c(1:2)]
party.cols <- c("#00529F", "#D82A20", "#098137", "#000000", "#EF4A42",
"#FDE401", "#501557", "#00AEEF", "#770808", "#151A61")
names(party.cols) <- party.names
#Extract data rows
tbl.rows <- tbl.rows[sapply(tbl.rows,function(x) length(grep("<td",x)))>1]
###UGLY HACK
#party.names <- party.names[1:9]
#Now extract the data
survey.dat <- lapply(tbl.rows,function(x) {
#Start by only considering where we have <td> tags
td.tags <- x[grep("<td",x)]
#Polling data appears in columns other than first two
dat <- td.tags[-c(1,2)]
#Now strip the data and covert to numeric format
dat <- gsub("<td>|</td>|<b>|</b>|<td style=|background:#[0-9A-Z]{6}","",dat)
dat <- gsub("\"", "", dat)
dat <- gsub("%","",dat)
dat <- gsub("-","0",dat)
dat <- gsub("<|>","",dat)
dat <- as.numeric(dat)
if(length(dat)!=length(party.names)) {
stop(sprintf("Survey data is not defined properly: %s",td.tags[1]))
}
names(dat) <- party.names
#Getting the date strings is a little harder. Start by tidying up the dates
date.str <- td.tags[2] #Dates are in the second column
date.str <- gsub("<sup.*</sup>","",date.str) #Throw out anything between superscript tags, as its an reference to the source
date.str <- gsub("<td>|</td>","",date.str) #Throw out any tags
#Get numeric parts of string
digits.str <- gsub("[^0123456789]"," ",date.str)
digits.str <- gsub("^ +","",digits.str) #Drop leading whitespace
digits <- strsplit(digits.str," +")[[1]]
yrs <- grep("[0-9]{4}",digits,value=TRUE)
days <- digits[!digits%in%yrs]
#Get months
month.str <- gsub("[^A-Z,a-z]"," ",date.str)
month.str <- gsub("^ +","",month.str) #Drop leading whitespace
mnths <- strsplit(month.str," +",month.str)[[1]]
#Now paste together to make standardised date strings
days <- rep(days,length.out=2)
mnths <- rep(mnths,length.out=2)
yrs <- rep(yrs,length.out=2)
dates.std <- paste(days,mnths,yrs)
#And finally the survey time
survey.time <- mean(as.POSIXct(strptime(dates.std,format="%d %B %Y")))
#Get the name of the survey company too
survey.comp <- td.tags[1]
survey.comp <- gsub("<sup.*</sup>","",survey.comp)
survey.comp <- gsub("<td>|</td>","",survey.comp)
survey.comp <- gsub("<U+2013>","-",survey.comp,fixed=TRUE)
survey.comp <- gsub("(?U)<.*>","",survey.comp,perl=TRUE)
survey.comp <- gsub("^ +| +$","",survey.comp)
survey.comp <- gsub("-+"," ",survey.comp)
#And now return results
return(data.frame(Company=survey.comp,Date=survey.time,date.str,t(dat)))
})
#Combine results
surveys <- do.call(rbind,survey.dat)
##ugly date fix
surveys[26, 2] <- "2015-10-06 00:00:00"
surveys[29, 2] <- "2015-11-15 00:00:00"
#Ugly fix to remove Opportunities party while not enough data
# surveys <- select(surveys, -TOP)
#==========================================
#Now generate each plot
#==========================================
smoothers <- list()
for(opt in opts) {
#Restrict data to selected parties
selected.parties <- gsub(" ","_",sort(opt$parties))
selected.cols <- party.cols[selected.parties]
plt.dat <- surveys[,c("Company","Date",selected.parties)]
plt.dat <- subset(plt.dat,!is.na(surveys$Date))
plt.dat <- plt.dat[order(plt.dat$Date),]
plt.dat$date.num <- as.double(plt.dat$Date)
plt.dat <- subset(plt.dat,Company!="2008 election result")
plt.dat$Company <- factor(plt.dat$Company)
#Setup plot
ticks <- ISOdate(c(rep(2014,1),rep(2015,2),rep(2016,2),rep(2017,2),2018),c(rep(c(7,1),4)),1)
xlims <- range(c(ISOdate(2014,11,1),ticks))
png(opt$fname,width=778,height=487,pointsize=16)
par(mar=c(5.5,4,1,1))
matplot(plt.dat$date.num,plt.dat[,selected.parties],pch=NA,xlim=xlims,ylab="Party support (%)",
xlab="",col=selected.cols,xaxt="n",ylim=opt$ylims,yaxs="i")
abline(h=seq(0,95,by=5),col="lightgrey",lty=3)
abline(v=as.double(ticks),col="lightgrey",lty=3)
box()
axis(1,at=as.double(ticks),labels=format(ticks,format="1 %b\n%Y"),cex.axis=0.8)
axis(4,at=axTicks(4),labels=rep("",length(axTicks(4))))
smoothed <- list()
predict.x <- seq(min(surveys$Date),max(surveys$Date),length.out=100)
for(i in 1:length(selected.parties)) {
smoother <- loess(surveys[,selected.parties[i]] ~ as.numeric(surveys[,"Date"]),span=0.35)
smoothed[[i]] <- predict(smoother,newdata=predict.x,se=TRUE)
polygon(c(predict.x,rev(predict.x)),
c(smoothed[[i]]$fit+smoothed[[i]]$se.fit*1.96,rev(smoothed[[i]]$fit-smoothed[[i]]$se.fit*1.96)),
col=rgb(0.5,0.5,0.5,0.5),border=NA)
}
names(smoothed) <- selected.parties
#Then add the data points
matpoints(surveys$Date, surveys[,selected.parties],pch=20,col=selected.cols)
#And finally the smoothers themselves
for(i in 1:length(selected.parties)) {
lines(predict.x,smoothed[[i]]$fit,col=selected.cols[i],lwd=2)
}
# #Then add the data points
# matpoints(plt.dat$date.num,plt.dat[,selected.parties],pch=20,col=selected.cols)
# #And finally the smoothers themselves
# for(n in selected.parties) {
# lines(smoothed.l[[n]]$date,smoothed.l[[n]]$fit,col=selected.cols[n],lwd=2)
# }
n.parties <- length(selected.parties)
legend(grconvertX(0.5,"npc"),grconvertY(0.0,"ndc"),xjust=0.5,yjust=0,
legend=gsub("_"," ",selected.parties), col=selected.cols,
pch=20,bg="white",lwd=2,
ncol=ifelse(n.parties>4,ceiling(n.parties/2),n.parties),xpd=NA)
#Add best estimates
fmt.str <- sprintf("%%2.%if\261%%1.%if %%%%",opt$dp,opt$dp)
for(n in names(smoothed)) {
lbl <- sprintf(fmt.str,
round(rev(smoothed[[n]]$fit)[1],opt$dp),
round(1.96*rev(smoothed[[n]]$se.fit)[1],opt$dp))
text(rev(plt.dat$date.num)[1],rev(smoothed[[n]]$fit)[1],
labels=lbl,pos=4,col=selected.cols[n],xpd=NA)
}
dev.off()
}
#==========================================
#Finished!
#==========================================
cat("Complete.\n")
Licensing
I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
- You are free:
- to share – to copy, distribute and transmit the work
- to remix – to adapt the work
- Under the following conditions:
- attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
Any autoconfirmed user can overwrite this file from the same source. Please ensure that overwrites comply with the guideline.
Items portrayed in this file
depicts
13 May 2016
File history
Click on a date/time to view the file as it appeared at that time.
Date/Time | Thumbnail | Dimensions | User | Comment | |
---|---|---|---|---|---|
current | 21:07, 21 September 2017 | 778 × 487 (16 KB) | Limegreen | add latest polls; change span to .24 | |
00:08, 17 September 2017 | 778 × 487 (27 KB) | Limegreen | new polls | ||
09:24, 14 September 2017 | 778 × 487 (15 KB) | Limegreen | add colmar brunton | ||
08:51, 12 September 2017 | 778 × 487 (28 KB) | Limegreen | add newshub latest | ||
00:51, 12 September 2017 | 778 × 487 (15 KB) | Limegreen | fix the error channel for conservatives | ||
01:28, 11 September 2017 | 778 × 487 (16 KB) | Limegreen | Switched to loess (span = .35) smoother, and added recent polls | ||
13:46, 28 August 2017 | 778 × 487 (21 KB) | Limegreen | add new polls | ||
11:47, 11 August 2017 | 778 × 487 (23 KB) | Limegreen | add new polls | ||
22:22, 31 July 2017 | 778 × 487 (22 KB) | Limegreen | Add Newshub Reid Research | ||
22:50, 30 July 2017 | 778 × 487 (22 KB) | Limegreen | add new colmar brunton poll. Change k value for smoother to 5 so that an estimate for TOP can be produced for the first time. |
File usage
The following page uses this file:
Global file usage
The following other wikis use this file:
- Usage on en.wikipedia.org
- Usage on fr.wikipedia.org
- Usage on ru.wikipedia.org