Tuesday, September 24, 2013

LAB 01: Part 1 Answers Question 4:

Question 4
Answer part 1:
On Wikipedia, it is easy to see how often an article has been edited. Once a user selects the View History tab in the upper right-hand corner of the article's main page, they can select a link entitled "Revision History Statistics". On this page the viewer can review statistics on the article's edits over various periods of time, e.g.:


  • The COBOL article has been edited as follows:
2012: 94 edits
2013: 67 edits

Total edits from 2012-2013= 161


  • The C++ article has been edited as follows:
2012: 31 edits
2013: 29 edits

Total edits from 2012-2013= 60

  • The Haskell article has been edited as follows:
2012: 116 edits
2013: 51 edits

Total edits from 2012-2013= 167

  • The Java article has been edited as follows: 
2012: 292 edits
2013: 85 edits

Total edits from 2012-2013= 377


  • The Python article has been edited as follows:
2012: 49 edits
2013: 25 edits

Total edits from 2012-2013= 74

Based on this information, we can see that the most updated of the five articles over the last two years, at the time of writing, is the Java article. This is followed by the Haskell article. The third most edited article is the COBOL article, and the fourth is the article on Python. The least updated of the five is the article on C++.
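The ranking above can be reproduced with a short Python sketch. It simply sums the yearly figures as listed in this post; the variable names are my own and nothing here is fetched from Wikipedia.

```python
# Yearly edit counts (2012, 2013) copied from the figures listed above.
edits = {
    "COBOL": (94, 67),
    "C++": (31, 29),
    "Haskell": (116, 51),
    "Java": (292, 85),
    "Python": (49, 25),
}

# Total edits per article across both years.
totals = {article: sum(counts) for article, counts in edits.items()}

# Articles ordered from most edited to least edited.
ranking = sorted(totals, key=totals.get, reverse=True)

for position, article in enumerate(ranking, start=1):
    print(f"{position}. {article}: {totals[article]} edits")
```

Sorting by the two-year total gives the same order as described: Java, Haskell, COBOL, Python, C++.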

Answer part 2:


On Wikipedia there are several ways of seeing how "good" an article is. To start with, there is a "good article" accreditation that can be awarded to any article on the website. The article in question must meet the "good article criteria" and pass through a "good article nomination". When the article has passed this process and been deemed "good", it is given a small plus symbol surrounded by a circle on the right-hand side of the page, on the same line as the title of the article.

For example, on the talk tabs of the articles for Java and Python, we can see that the articles have a "good" status. However, on the C++ talk tab we can see that the article once had a "good" rating, but it has since been removed. The articles for Haskell and COBOL do not have a "good" rating and are given a "B" class rating in the WikiProject for computer science.

Another way of determining whether an article is good is to view the number of edits, or the number of subjects in the discussion/talk tab. Although subjects in the talk tab are not solely for the purpose of pointing out flaws in the article, this is usually the case. Therefore, one could judge that the fewer discussion topics there are in the talk area, the more reliable and accurate the article is. In a similar way, if one were to review how many edits a Wikipedia page has undergone, one assumption we could make is that the original article contained false or inaccurate information. This, however, is not entirely reliable, as edits are often required not because the information in the article is flawed but because the article has been vandalized.

Answer part 3:
As previously stated, edits to articles are sometimes needed when they have been vandalized. To estimate how much or how often an article on Wikipedia has been vandalized, we can click on the "View History" tab on the top right-hand side of the article page. This brings us to a new page where we can view the edits which have been made to the article since it was first written. Alongside the details of when each edit was made and who made it, we can see how much the article was altered in terms of bytes, together with a brief description of the edit from the person who posted it. When we examine these numbers, it is possible to see how much the article has been changed.

Based on this information, it is apparent from viewing the edit histories of all 5 pages that the article on C++ is the most vandalized article. 
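The approach described above, reading the byte changes and edit descriptions in the history, can be roughly sketched in Python. This is only an illustration under my own assumptions: the keyword list is a guess at how reverts are usually described, and `sample_history` is made-up data, not the real history of any of the five articles.

```python
# Assumed keywords that suggest an edit is undoing vandalism (a heuristic).
REVERT_MARKERS = ("revert", "undid", "vandal")


def looks_like_revert(summary):
    """Return True if the edit description mentions a revert or vandalism."""
    text = summary.lower()
    return any(marker in text for marker in REVERT_MARKERS)


def estimate_vandalism(revisions):
    """Count revert-like edits in a list of (byte_change, summary) pairs.

    Returns (number of revert-like edits, their share of all edits).
    """
    reverts = sum(1 for _, summary in revisions if looks_like_revert(summary))
    share = reverts / len(revisions) if revisions else 0.0
    return reverts, share


# Hypothetical history entries: (change in bytes, edit description).
sample_history = [
    (120, "Added citation for the standard library section"),
    (-5400, ""),
    (5400, "Reverted edits by 192.0.2.1 to last version"),
    (-12, "Fixed typo"),
    (3, "Undid revision by Example"),
]

reverts, share = estimate_vandalism(sample_history)
print(f"{reverts} revert-like edits, {share:.0%} of all edits")
```

Counting reverts rather than the vandal edits themselves is a deliberate simplification, since vandal edits rarely describe themselves honestly, while the edits that undo them usually say so.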

Answer part 4:
From studying the Contents of each Wikipedia article, we can see a few recurring titles which appear in all 5 articles. The most common title in all articles is "History", followed by a title based on "Criticism". After this we have recurring titles of "Features and Syntax" in several of the articles. Also recurring are the titles "Reference", "Further Reading" and "External Links".

Answer part 5:
As already pointed out, on Wikipedia we can see how often an article has been edited in the "View History" tab. While we can see from the edit descriptions that edits are often repeated or undone by people, it is generally safe to assume that the more edits an article has in its history, the more its current iteration differs from the original article posted. Therefore, based on the number of edits in the histories of all 5 articles, the article on Java is the most different from its original posting. C++ is the article with the fewest edits, so in terms of its content it can be estimated that this article is the most similar to its original form.

Another way of estimating how different an article is from its original is to compare the size of the original article with its present-day size in bytes. By this measure, Python is the most different, as the original article was 1,801 bytes in size while the current article is 75,380 bytes at the time of writing. The least different from its original is the article on COBOL, which was 3,385 bytes originally and is 30,844 bytes at the time of writing.
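The size comparison can be worked through with a small Python sketch using the two pairs of byte counts quoted above. The `growth` helper is my own naming, and the other three articles are left out because their sizes are not listed here.

```python
# (original size, current size) in bytes, as quoted above.
sizes = {
    "Python": (1801, 75380),
    "COBOL": (3385, 30844),
}


def growth(original, current):
    """Return (bytes added, current size as a multiple of the original)."""
    return current - original, current / original


for article, (original, current) in sizes.items():
    added, factor = growth(original, current)
    print(f"{article}: +{added:,} bytes, about {factor:.1f}x the original size")
```

Both the absolute difference and the ratio point the same way for these two articles: Python has grown far more than COBOL since its first version.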
