This commit is contained in:
Andy Sotheran 2019-04-28 19:04:40 +01:00
parent 0ed655e9ad
commit bc76ef7791
7 changed files with 885 additions and 151 deletions

722
PID.txt Normal file
View File

@ -0,0 +1,722 @@
Individual Project (CS3IP16)
Department of Computer Science
University of Reading
Project Initiation Document
PID Sign-Off
Student No.: 24005432
Student Name: Andrew Sotheran
Email: andrew.sotheran@student.reading.ac.uk
Degree programme (BSc CS/BSc IT): BSc CS
Supervisor Name: Kenneth Boness
Supervisor Signature:
Date:
SECTION 1 General Information
Project Identification
1.1
Project ID
(as in handbook)
N/A
1.2
Project Title
Cryptocurrency market and value prediction tracking
1.3
Briefly describe the main purpose of the project in no more than 25 words
To provide a means of predicting the value of cryptocurrencies that will aid investors in making market investment decisions.
Student Identification
1.4
Student Name(s), Course, Email address(s)
e.g. Anne Other, BSc CS, a.other@student.reading.ac.uk
Andrew William Sotheran
BSc CS
Andrew.sotheran@student.reading.ac.uk
Supervisor Identification
1.5
Primary Supervisor Name, Email address
e.g. Prof Anne Other, a.other@reading.ac.uk
1.6
Secondary Supervisor Name, Email address
Only fill in this section if a secondary supervisor has been assigned to your project
Company Partner (only complete if there is a company involved)
1.7
Company Name
N/A
1.8
Company Address
N/A
1.9
Name, email and phone number of Company Supervisor or Primary Contact
N/A
SECTION 2 Project Description
2.1
Summarise the background research for the project in about 400 words. You must include
references in this section but don't count them in the word count.
The aim is to create a tool that predicts the price of cryptocurrencies and aids investor decisions.
Research will need to be conducted into the following topics surrounding data mining, machine
learning and artificial neural networks.
This research will cover the following areas:
Natural Language Processing and analysis: to analyse and process data gathered through RSS and
social media feeds via the underlying tasks of natural language processing: content categorisation
(search and indexing, duplication detection); topic discovery and modelling (obtaining meanings and
themes within the data and applying analytic techniques); sentiment and semantic analysis
(identifying the mood and opinions within the data); and summarisation (condensing a block of text
and disregarding the rest).
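As an illustration of the sentiment-analysis task above, the sketch below uses NLTK's VADER analyser (NLTK is one of the libraries listed later in this section); the example headline is invented.

# Minimal sentiment-analysis sketch using NLTK's VADER analyser.
# Assumes NLTK is installed; the vader_lexicon resource is downloaded on first run.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-off lexicon download

analyser = SentimentIntensityAnalyzer()
headline = "Bitcoin rallies as institutional investors pile in"  # invented example text
scores = analyser.polarity_scores(headline)
print(scores)  # dict with 'neg', 'neu', 'pos' and 'compound' scores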
Machine learning algorithms: the three types of machine learning (supervised, unsupervised and
reinforcement learning) and the common algorithms used. Each of these will be researched to identify
the most suitable for this project, and only one will be used: Linear Regression, Logistic Regression,
Decision Tree, SVM, Naive Bayes, kNN, K-Means, Random Forest, dimensionality reduction
algorithms, or gradient boosting algorithms (GBM, XGBoost, LightGBM, CatBoost).
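To make that comparison concrete, the sketch below benchmarks a few of the candidate regressors with scikit-learn (listed among the libraries below); the random data is only a stand-in for engineered price and sentiment features.

# Minimal sketch comparing candidate scikit-learn regressors with cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                   # placeholder feature matrix
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)    # placeholder target

candidates = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(name, "mean R^2:", round(scores.mean(), 3))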
Artificial Neural Networks: to identify the drawbacks and benefits of using them compared with other
computational models within machine learning, including recurrent neural networks and
third-generation neural networks.
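Purely as an illustration of the recurrent-network option, the sketch below builds a small LSTM for next-step regression. Keras is assumed here for illustration only; it is not named in this document, and the sequence shape and data are placeholders.

# Illustrative-only sketch of a small recurrent network for sequence regression.
import numpy as np
from tensorflow import keras

lookback, n_features = 24, 2   # e.g. 24 hourly steps of (price, sentiment) - assumed shape
X = np.random.rand(100, lookback, n_features).astype("float32")  # placeholder sequences
y = np.random.rand(100).astype("float32")                        # placeholder next-step price

model = keras.Sequential([
    keras.Input(shape=(lookback, n_features)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=16, verbose=0)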
Data mining: to investigate the different techniques and algorithms used (the same as those listed
above for machine learning, plus C4.5, Apriori, EM, PageRank, AdaBoost and CART); these will be
researched and the most appropriate identified.
To investigate techniques for storing and processing large amounts of data, such as Hadoop and
Elasticsearch utilities, as well as graphing, data modelling and visualisation.
To identify appropriate Python or C libraries for each of the topics above to aid in the creation of
this project, such as:
Natural Language Toolkit (NLTK) - Python
Pandas - Python
Scikit-learn (sklearn) - Python
NumPy - Python - scientific computation for working with arrays
Matplotlib - Python - data visualisation
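As a small illustration of how the listed libraries fit together, the sketch below loads and plots historical price data with pandas and Matplotlib; the file name and column names ("date", "close") are assumptions for illustration.

# Sketch: load and plot historical price data with pandas and matplotlib.
import pandas as pd
import matplotlib.pyplot as plt

prices = pd.read_csv("btc_history.csv", parse_dates=["date"])  # assumed file and columns
prices = prices.sort_values("date").set_index("date")

prices["close"].plot(title="BTC closing price")
plt.ylabel("Price (USD)")
plt.tight_layout()
plt.show()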
Investigate types of databases, SQL and NoSQL, as a storage medium between receiving data and
feeding it into the machine learning algorithm.
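A minimal sketch of the SQL option follows, using Python's built-in sqlite3 module as a stand-in for whichever SQL or NoSQL engine the research settles on; the table layout and example row are assumptions.

# Sketch: a simple SQL storage layer between data gathering and the ML stage.
import sqlite3

conn = sqlite3.connect("predictions.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS price_sentiment (
           ts        TEXT PRIMARY KEY,   -- ISO-8601 timestamp
           price_usd REAL,
           sentiment REAL                -- e.g. mean compound sentiment for the hour
       )"""
)
conn.execute(
    "INSERT OR REPLACE INTO price_sentiment VALUES (?, ?, ?)",
    ("2018-10-01T12:00:00Z", 6600.25, 0.31),  # made-up example row
)
conn.commit()
print(conn.execute("SELECT * FROM price_sentiment ORDER BY ts").fetchall())
conn.close()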
Investigate the use of REST APIs and other web-service-based technologies (gRPC,
Elasticsearch).
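To illustrate the REST option only, the sketch below polls a price endpoint with the requests library; the URL and JSON shape are hypothetical placeholders, not a real service named in this document.

# Hypothetical REST polling sketch; the URL and response fields are placeholders.
import requests

API_URL = "https://example.com/api/v1/price/BTC"  # placeholder endpoint, not a real service

def fetch_price():
    response = requests.get(API_URL, timeout=10)
    response.raise_for_status()
    payload = response.json()   # assumed shape: {"symbol": "BTC", "usd": 6600.25}
    return payload["usd"]

if __name__ == "__main__":
    print("Latest BTC price:", fetch_price())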
Investigate frameworks for the thin client, such as Angular vs React, Node.js, Leaflet.js and Chart.js.
Additionally, web scraping may be needed for certain websites that do not provide an API or JSON
feed for the data needed.
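A minimal scraping sketch is shown below; the URL and the CSS selector are hypothetical, and any real target site's robots.txt and terms of service would need checking first.

# Web-scraping sketch for sites without an API or RSS feed.
import requests
from bs4 import BeautifulSoup

page = requests.get("https://example.com/crypto-news", timeout=10)  # placeholder URL
soup = BeautifulSoup(page.text, "html.parser")

headlines = [h.get_text(strip=True) for h in soup.select("h2.headline")]  # assumed selector
for headline in headlines:
    print(headline)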
https://www.sas.com/en_gb/insights/analytics/what-is-natural-language-processing-nlp.html
https://blog.algorithmia.com/introduction-natural-language-processing-nlp/
https://gerardnico.com/data_mining/algorithm
https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/
https://www.kdnuggets.com/2015/05/top-10-data-mining-algorithms-explained.html
https://www.datasciencecentral.com/profiles/blogs/artificial-neural-network-ann-in-machine-learning
http://scikit-learn.org/stable/index.html
https://grpc.io/docs/
2.2
Summarise the project objectives and outputs in about 400 words.
These objectives and outputs should appear as tasks, milestones and deliverables in your project plan.
In general, an objective is something you can do and an output is something you produce; one leads
to the other.
To produce a thin web client with a dashboard that provides tangible and useful information
to users, such as the current price of a cryptocurrency (updated every 5 minutes), exchange rates,
network hashrates and historical price data. It will also display statistics about sentiment analysis
conducted on social media about the currency, graphical predictions of what the price may be at a
given time, and comparisons with other currencies to aid investment.
To produce significant research into the topics in and around data mining, machine learning and
artificial neural networks, and the underlying tasks and algorithms used, covering the efficiency,
drawbacks and advantages of each to identify the most suitable for use in this project.
To produce a system that analyses a data set obtained through social media feeds and posts on news
sites regarding cryptocurrencies. It should perform sentiment analysis using natural language
processing and analysis techniques to identify features, identify the type of sentiment in the data
and categorise it for machine learning.
To utilise machine learning techniques and algorithms to produce a system that learns from historical
data to predict, to an extent, the possible future price of a given currency, and to compare this with
the use of an artificial neural network, analysing the drawbacks of both.
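As an illustration of that last objective, the sketch below shows one common way of framing historical prices as supervised learning samples using a look-back window; the window length and prices are invented.

# Sketch: turn a price series into (look-back window -> next value) training pairs.
import numpy as np

def make_windows(series, lookback):
    """Return X of shape (n, lookback) and y of shape (n,) for next-step prediction."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return np.array(X), np.array(y)

prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1, 102.7, 104.0])  # made-up prices
X, y = make_windows(prices, lookback=3)
print(X.shape, y.shape)   # (4, 3) (4,)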
2.3
Initial project specification - list key features and functions of your finished project.
Remember that a specification should not usually propose the solution. For example, your project
may require open-source datasets, so add that to the specification, but don't state how that data link
will be achieved; that comes later.
The finished project should provide a thin-client single-page application. This will give users the
ability to view various statistics on cryptocurrencies on a dashboard that incorporates text
analysis through natural language analysis, and will utilise various machine learning and data mining
techniques to provide price predictions to the users. The nature and level of this will depend on the
research conducted into the areas of data mining, machine learning, natural language processing and
artificial neural networks, along with the algorithms used.
The data set will be created from scratch for this project, as it will require gathering data from
numerous sources and performing text analysis on them to form the data needed. Data sets for the
characteristics and data of the currencies can be obtained from pre-existing data sets such as:
https://www.kaggle.com/sudalairajkumar/cryptocurrencypricehistory
https://www.kaggle.com/jessevent/all-crypto-currencies
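As an illustration of how such a pre-existing price data set could be combined with the sentiment data gathered for this project, a minimal pandas sketch follows; the file names and column names are assumptions, not taken from the data sets above.

# Sketch: merge pre-existing price data with gathered sentiment scores by date.
import pandas as pd

prices = pd.read_csv("crypto-markets.csv", parse_dates=["date"])         # assumed export
sentiment = pd.read_csv("twitter_sentiment.csv", parse_dates=["date"])   # gathered in-house

btc = prices[prices["symbol"] == "BTC"][["date", "close"]]               # assumed columns
merged = btc.merge(sentiment[["date", "compound"]], on="date", how="left")
merged["compound"] = merged["compound"].fillna(0.0)   # days with no scored posts
print(merged.head())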
Web scraping may be included if certain news/social media websites do not provide an API or RSS
feed for the analysis engine to perform text analysis on.
Additionally, there will be a server between the analysis/prediction engine and the thin client that
maintains a database, either SQL or NoSQL, holding statistics about the currencies and data about
their price predictions. It will not hold any of the data used in the analysis engine, as this database
will only hold data available to the end users.
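Purely to illustrate this server layer, the sketch below exposes stored prediction statistics over a REST endpoint. Flask is assumed for illustration only; the framework choice, route and payload shape are not decided in this document.

# Illustrative sketch of the server layer exposing stored predictions to the thin client.
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for rows read from the SQL/NoSQL statistics database.
PREDICTIONS = {"BTC": {"next_hour_price_usd": 6612.4, "sentiment": 0.31}}

@app.route("/api/predictions/<symbol>")
def get_prediction(symbol):
    data = PREDICTIONS.get(symbol.upper())
    if data is None:
        return jsonify({"error": "unknown symbol"}), 404
    return jsonify({"symbol": symbol.upper(), **data})

if __name__ == "__main__":
    app.run(port=5000)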
2.4
Describe the social, legal and ethical issues that apply to your project. Does your project
require ethical approval? (If your project requires a questionnaire/interview for conducting
research and/or collecting data, you will need to apply for ethical approval.)
The project will not be handling any user-related data; therefore, it does not need ethical approval.
2.5
Identify and list the items you expect to need to purchase for your project. Specify the cost
(include VAT and shipping if known) of each item as well as the supplier.
e.g. item 1 name, supplier, cost
None Needed
2.6
State whether you need access to specific resources within the department or the University e.g.
special devices and workshop
Possibly a server to host the database and analysis engine and perform the necessary computation,
and a server to host the thin client.
SECTION 3 Project Plan
3.1
Project Plan
Split your project work into sections/categories/phases and add tasks for each of these sections. It is
likely that the high-level objectives you identified in section 2.2 become sections here. The outputs from
section 2.2 should appear in the Outputs column here. Remember to include tasks for your project
presentation, project demos, producing your poster, and writing up your report.
Task No. / Task description / Effort (weeks) / Outputs

1  Background Research
1.1  Investigate RPC frameworks and REST APIs (0.3 weeks). Output: identify the type of API/RPC framework that would be most suitable.
1.2  Research Natural Language Processing and analysis techniques (0.5 weeks). Output: gain an understanding of how NLP works and how it could be used.
1.3  Research the use of machine learning types and algorithms (0.5 weeks). Output: grasp how ML paradigms work and how this project will use them.
1.4  Research the application of Neural Networks and the drawbacks and advantages of using them (0.3 weeks). Output: identify whether a neural network is needed or whether ML paradigms can be used instead.
1.5  Research techniques for storing and processing large amounts of data, such as Hadoop, Spark or Elasticsearch utilities (1 week). Output: understand their uses and applications, and whether they are a more viable solution than standard ML practices.
1.6  Identify appropriate libraries for data modelling and visualisation, NLP and machine learning (1 week). Output: identify which libraries will aid in the construction of this project.
1.7  Investigate frameworks for the front-end thin client (0.3 weeks). Output: identify which frameworks the thin client should be built with, along with their drawbacks and advantages.
1.8  Research web scraping techniques (0.3 weeks). Output: understand the application of these techniques and learn how to apply them.

2  Analysis and design
2.1  Resolve issues discovered by background research (0.2 weeks).
2.2  Identify limitations discovered from research and what is not feasible (0.1 weeks).
2.3  UML diagrams / xUML (0.2 weeks).
2.4  Wireframes for the front end (0.1 weeks).
2.5  Data flow (0.1 weeks).
2.6  User flow (0.1 weeks).

3  Develop prototype
3.1  Develop thin client (2 weeks).
3.2  Develop analysis engine (4 weeks).
3.3  Develop prediction engine (3 weeks).
3.4  Develop unit tests (2 weeks).

4  Testing, evaluation/validation
4.1  Unit testing (1 week).
4.2  Acceptance testing (0.8 weeks).
4.3  User testing (0.8 weeks).

5  Assessments
5.1  Write up project report (2 weeks). Output: project report.
5.2  Produce poster (0.5 weeks). Output: poster.
5.3  Log book (0.5 weeks).

TOTAL (sum of total effort in weeks): 21.9
SECTION 4 - Time Plan for the proposed Project work
For each task identified in 3.1, please shade the weeks when you'll be working on that task. You should also mark target milestones, outputs and key decision points.
To shade a cell in MS Word, move the mouse to the top left of the cell until the cursor becomes an arrow pointing up, left-click to select the cell, then right-click and
select Borders and Shading. Under the Shading tab pick an appropriate grey colour and click OK.
START DATE: 10/2018
Project weeks (in three-week blocks): 0-3, 3-6, 6-9, 9-12, 12-15, 15-18, 18-21, 21-24, 24-27, 27-30, 30-33, 33-36, 36-39

Project stages and tasks scheduled across these weeks:
1  Background Research: investigate RPC frameworks and REST APIs; research Natural Language Processing and analysis techniques; research the use of machine learning types and algorithms; research the application of Neural Networks and the drawbacks and advantages of using them; research techniques for storing and processing large amounts of data, such as Hadoop, Spark or Elasticsearch utilities; identify appropriate libraries for data modelling and visualisation, NLP and machine learning; investigate frameworks for the front-end thin client; research web scraping techniques.
2  Analysis/Design: resolve issues discovered by background research; identify limitations discovered from research and what is not feasible; UML diagrams / xUML; wireframes for the front end; data flow; user flow.
3  Develop prototype: develop thin client; develop analysis engine; develop prediction engine; develop unit tests.
4  Testing, evaluation/validation: unit testing; acceptance testing; user testing.
5  Assessments: write up project report; produce poster; log book.
RISK ASSESSMENT FORM
Assessment Reference No.:
Area or activity assessed:
Assessment date:
Persons who may be affected by the activity (i.e. are at risk): Andrew Sotheran

SECTION 1: Identify Hazards - Consider the activity or work area and identify if any of the hazards listed below are significant (tick the boxes that apply).
Hazard categories listed on the form: fall of person (from work at height); fall of objects; slips, trips & housekeeping; manual handling operations; display screen equipment; lighting levels; heating & ventilation; layout, storage, space, obstructions; welfare facilities; electrical equipment; use of portable tools / equipment; fixed machinery or lifting equipment; pressure vessels; noise or vibration; fire hazards & flammable material; outdoor work / extreme weather; fieldtrips / field work; hazardous fumes, chemicals, dust; radiation sources; work with lasers; hazardous biological agent; confined space / asphyxiation risk; condition of buildings & glazing; vehicles / driving at work; food preparation; occupational stress; violence to staff / verbal assault; work with animals; lone working / work out of hours; other(s) - specify.
The hazards ticked and carried forward to Section 2 are Slips, Trips & Housekeeping (3) and Display screen equipment (5).

SECTION 2: Risk Controls - For each hazard identified in Section 1, complete Section 2.

Hazard No.: 3
Hazard description: Tripping over wires.
Existing controls to reduce risk: cable management is at a minimum; none of the cables are currently properly managed or kept out of the way.
Risk level (tick one: High / Med / Low):
Further action needed to reduce risks (provide timescales and initials of person responsible): sufficient cable management needed; cables tied together and moved out of the way of feet.

Hazard No.: 5
Hazard description: Eye strain from looking at a monitor.
Existing controls to reduce risk: current screen contrast and brightness are acceptable.
Risk level (tick one: High / Med / Low):
Further action needed to reduce risks (provide timescales and initials of person responsible): take periodic breaks from the screen.

SIGNED:
Name of Assessor(s):
Review date:

Health and Safety Risk Assessments continuation sheet
Assessment Reference No.:
Continuation sheet number:
SECTION 2 continued: Risk Controls - no additional hazards recorded.
SIGNED:
Name of Assessor(s):
Review date:

View File

@ -294,9 +294,9 @@
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {13}Social, Legal and Ethical Issues}{88}{section.13}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {14}Conclusion and Future Improvements}{89}{section.14}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {14.1}Conclusion}{89}{subsection.14.1}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {14.2}Future Improvements}{89}{subsection.14.2}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {15}Appendices}{95}{section.15}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {15.1}Appendix A - Project Initiation Document}{95}{subsection.15.1}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {14.2}Future Improvements}{90}{subsection.14.2}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{\numberline {15}Appendices}{97}{section.15}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {15.1}Appendix A - Project Initiation Document}{97}{subsection.15.1}}
\abx@aux@refcontextdefaultsdone
\abx@aux@defaultrefcontext{0}{SaTdpsmm}{none/global//global/global}
\abx@aux@defaultrefcontext{0}{nlAeiBTCPSO}{none/global//global/global}
@ -349,4 +349,4 @@
\abx@aux@defaultrefcontext{0}{SpamOrHamGit}{none/global//global/global}
\abx@aux@defaultrefcontext{0}{MBE}{none/global//global/global}
\abx@aux@defaultrefcontext{0}{TwitterTerms}{none/global//global/global}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {15.2}Appendix B - Log book}{108}{subsection.15.2}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {15.2}Appendix B - Log book}{110}{subsection.15.2}}

View File

@ -1,4 +1,4 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2018.10.16) 28 APR 2019 17:50
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2018.10.16) 28 APR 2019 19:03
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
@ -1237,56 +1237,56 @@ Overfull \hbox (14.37637pt too wide) in paragraph at lines 1746--1747
[]
[81] [82 <./images/with_sentiment.png> <./images/without_sentiment.png>]
[83] [84] [85] [86] [87] [88] [89]
Overfull \hbox (40.38213pt too wide) in paragraph at lines 1867--1867
[83] [84] [85] [86] [87] [88] [89] [90] [91]
Overfull \hbox (40.38213pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 works,'' To-wards Data Sci-ence, 2018. [On-line]. Avail-able: []
$\T1/cmtt/m/n/12 https : / / towardsdatascience .
[]
[90]
Overfull \hbox (83.66737pt too wide) in paragraph at lines 1867--1867
[92]
Overfull \hbox (83.66737pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 works,'' Ma-chine Larn-ing Mas-tery, 2017. [On-line]. Avail-able
: []$\T1/cmtt/m/n/12 https : / / machinelearningmastery .
[]
Overfull \hbox (28.45175pt too wide) in paragraph at lines 1867--1867
Overfull \hbox (28.45175pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 lem,'' Su-per Data Sci-ence, 2018. [On-line]. Avail-able: []$\T1
/cmtt/m/n/12 https : / / www . superdatascience .
[]
[91]
Overfull \hbox (7.75049pt too wide) in paragraph at lines 1867--1867
[93]
Overfull \hbox (7.75049pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 2019. [On-line]. Avail-able: []$\T1/cmtt/m/n/12 https : / / medi
um . com / datadriveninvestor / overview -[]
[]
[92]
Overfull \hbox (7.25049pt too wide) in paragraph at lines 1867--1867
[94]
Overfull \hbox (7.25049pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 2017. [On-line]. Avail-able: []$\T1/cmtt/m/n/12 https : / / www
. statisticshowto . datasciencecentral .
[]
Overfull \hbox (9.24751pt too wide) in paragraph at lines 1867--1867
Overfull \hbox (9.24751pt too wide) in paragraph at lines 1879--1879
\T1/cmr/m/n/12 [On-line]. Avail-able: []$\T1/cmtt/m/n/12 http : / / blog . alej
andronolla . com / 2013 / 05 / 15 / detecting -[]
[]
Overfull \hbox (0.88026pt too wide) in paragraph at lines 1867--1867
Overfull \hbox (0.88026pt too wide) in paragraph at lines 1879--1879
[]\T1/cmr/m/n/12 P. Cryp-tog-ra-phy, ``A tu-to-rial on au-to-matic lan-guage id
en-ti-fi-ca-tion - ngram based,''
[]
[93] [94]
[95] [96]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1794, 597.55246pt x 845.07718pt>
<PID.pdf, id=1802, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf>
Package pdftex.def Info: PID.pdf used on input line 1872.
Package pdftex.def Info: PID.pdf used on input line 1884.
(pdftex.def) Requested size: 597.551pt x 845.07512pt.
@ -1294,7 +1294,7 @@ pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
File: PID.pdf Graphic file (type pdf)
<use PID.pdf>
Package pdftex.def Info: PID.pdf used on input line 1872.
Package pdftex.def Info: PID.pdf used on input line 1884.
(pdftex.def) Requested size: 597.551pt x 845.07512pt.
@ -1304,222 +1304,222 @@ rsion <1.7>, but at most version <1.5> allowed
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1797, page=1, 597.55246pt x 845.07718pt>
<PID.pdf, id=1805, page=1, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 1>
Package pdftex.def Info: PID.pdf , page1 used on input line 1872.
Package pdftex.def Info: PID.pdf , page1 used on input line 1884.
(pdftex.def) Requested size: 597.551pt x 845.07512pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 1>
Package pdftex.def Info: PID.pdf , page1 used on input line 1872.
Package pdftex.def Info: PID.pdf , page1 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[95]
[97]
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 1>
Package pdftex.def Info: PID.pdf , page1 used on input line 1872.
Package pdftex.def Info: PID.pdf , page1 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 1>
Package pdftex.def Info: PID.pdf , page1 used on input line 1872.
Package pdftex.def Info: PID.pdf , page1 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 1>
Package pdftex.def Info: PID.pdf , page1 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[96 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1827, page=2, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[97 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1834, page=3, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1872.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1872.
Package pdftex.def Info: PID.pdf , page1 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[98 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1848, page=4, 597.55246pt x 845.07718pt>
<PID.pdf, id=1836, page=2, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1872.
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1872.
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1872.
<use PID.pdf, page 2>
Package pdftex.def Info: PID.pdf , page2 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[99 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1854, page=5, 597.55246pt x 845.07718pt>
<PID.pdf, id=1842, page=3, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1872.
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1872.
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1872.
<use PID.pdf, page 3>
Package pdftex.def Info: PID.pdf , page3 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[100 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1860, page=6, 597.55246pt x 845.07718pt>
<PID.pdf, id=1856, page=4, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1872.
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1872.
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1872.
<use PID.pdf, page 4>
Package pdftex.def Info: PID.pdf , page4 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[101 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1866, page=7, 597.55246pt x 845.07718pt>
<PID.pdf, id=1862, page=5, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1872.
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1872.
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1872.
<use PID.pdf, page 5>
Package pdftex.def Info: PID.pdf , page5 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[102 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1872, page=8, 845.07718pt x 597.55246pt>
<PID.pdf, id=1868, page=6, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 6>
Package pdftex.def Info: PID.pdf , page6 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[103 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1882, page=9, 845.07718pt x 597.55246pt>
<PID.pdf, id=1875, page=7, 597.55246pt x 845.07718pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1872.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
<use PID.pdf, page 7>
Package pdftex.def Info: PID.pdf , page7 used on input line 1884.
(pdftex.def) Requested size: 562.1644pt x 795.0303pt.
[104 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1893, page=10, 845.07718pt x 597.55246pt>
<PID.pdf, id=1881, page=8, 845.07718pt x 597.55246pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1872.
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1872.
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1872.
<use PID.pdf, page 8>
Package pdftex.def Info: PID.pdf , page8 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
[105 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1905, page=11, 845.07718pt x 597.55246pt>
<PID.pdf, id=1891, page=9, 845.07718pt x 597.55246pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1872.
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1872.
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1872.
<use PID.pdf, page 9>
Package pdftex.def Info: PID.pdf , page9 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
[106 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1911, page=12, 845.07718pt x 597.55246pt>
<PID.pdf, id=1901, page=10, 845.07718pt x 597.55246pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1872.
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1872.
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1872.
<use PID.pdf, page 10>
Package pdftex.def Info: PID.pdf , page10 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
[107 <./PID.pdf>]
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 1877.
[108]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 1877.
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1913, page=11, 845.07718pt x 597.55246pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 11>
Package pdftex.def Info: PID.pdf , page11 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
[108 <./PID.pdf>]
pdfTeX warning: /usr/bin/pdflatex (file ./PID.pdf): PDF inclusion: found PDF ve
rsion <1.7>, but at most version <1.5> allowed
<PID.pdf, id=1919, page=12, 845.07718pt x 597.55246pt>
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
File: PID.pdf Graphic file (type pdf)
<use PID.pdf, page 12>
Package pdftex.def Info: PID.pdf , page12 used on input line 1884.
(pdftex.def) Requested size: 795.0303pt x 562.1644pt.
[109 <./PID.pdf>]
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 1889.
[110]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 1889.
(./document.aux)
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 1877.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 1877.
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 1889.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 1889.
Package rerunfilecheck Info: File `document.out' has not changed.
(rerunfilecheck) Checksum: 1D7B2504DFF5D56ABCCDF1948D08498A;14207.
@ -1528,8 +1528,8 @@ Package logreq Info: Writing requests to 'document.run.xml'.
)
Here is how much of TeX's memory you used:
25151 strings out of 492982
396355 string characters out of 6134895
25153 strings out of 492982
396371 string characters out of 6134895
1018656 words of memory out of 5000000
27463 multiletter control sequences out of 15000+600000
21245 words of font info for 60 fonts, out of 8000000 for 9000
@ -1553,10 +1553,10 @@ ic/cm-super/sfrm0600.pfb></usr/share/texmf/fonts/type1/public/cm-super/sfrm1000
mf/fonts/type1/public/cm-super/sfrm1440.pfb></usr/share/texmf/fonts/type1/publi
c/cm-super/sfrm2488.pfb></usr/share/texmf/fonts/type1/public/cm-super/sfti1200.
pfb></usr/share/texmf/fonts/type1/public/cm-super/sftt1200.pfb>
Output written on document.pdf (108 pages, 2423767 bytes).
Output written on document.pdf (110 pages, 2428026 bytes).
PDF statistics:
2175 PDF objects out of 2487 (max. 8388607)
1980 compressed objects within 20 object streams
886 named destinations out of 1000 (max. 500000)
2185 PDF objects out of 2487 (max. 8388607)
1988 compressed objects within 20 object streams
888 named destinations out of 1000 (max. 500000)
855 words of extra memory for PDF output out of 10000 (max. 10000000)

Binary file not shown.

Binary file not shown.

View File

@ -683,7 +683,7 @@
The initial PID did, however, give a basis from which to develop ideas and initial research, and was the initial driving force of this project.
\subsection{Solution Summary}\label{summary}
The overall solution, concerning the problem statement, is to create a system mainly consisting of; a frontend application that will display plotting, predicted and true, performance metric data to the user as a clear and concise form. The backend system behind the price forecasting will consist of various subsystem responsible for data collection, filtering, data pre-processing, sentiment analysis, network training, validation and training and future price predictions. Each stage will consist of relevant tools and techniques for performing their required task.
The overall solution, concerning the problem statement, is to create a system consisting mainly of a frontend application that will display plotted predicted and true performance metric data to the user in a clear and concise form. The backend system behind the price forecasting will consist of various subsystems responsible for data collection, filtering, data pre-processing, sentiment analysis, network training and validation, and future price predictions. Each stage will use relevant tools and techniques for performing its required task.
\newpage
@ -1799,8 +1799,8 @@ def create_sets(self, data, lookback, sentiment):
Lastly, a limitation that could be identified and is also discussed in the results section above is that of the performance metrics not showing a clear distinction between the two network models. This limitation could be overcome by using more suitable explanative metrics, rather than relying on a more visual inspection, such as:
\begin{itemize}
\item Adjusted $R^2$ statistic - which shows how well the selected independant variables of the model explain the variability of the dependant variables, and shows how well the terms fit a regression line \cite{RMSEMAE}.
\item Mean Bias Error (MBE) - Is the Mean Absolute Error (MAE), which is calculated, if the absolute value is not taken (the signs of the errors are not removed) the MAE becomes the mean biased error. The MBE is intended to measure the average model bias and can convay more useful information that the MAE, but should be interpeted with caution due to the positive and negative error cancelling out. \cite{MBE}
\item Adjusted $R^2$ statistic - Shows how well the selected independent variables of the model explain the variability of the dependent variables, and how well the terms fit a regression line \cite{RMSEMAE}.
\item Mean Bias Error (MBE) - Calculated like the Mean Absolute Error (MAE) but without taking the absolute value (the signs of the errors are kept), so the MAE becomes the mean bias error. The MBE is intended to measure the average model bias and can convey more useful information than the MAE, but should be interpreted with caution because positive and negative errors cancel each other out. \cite{MBE}
\end{itemize}
Calculating these metrics could aid in distinguishing between the models based on quantitative measures rather than on visual analysis alone.
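A minimal sketch of how these additional metrics could be computed from the true and predicted price series is shown below; the arrays and the assumed number of predictors are placeholders rather than values from this project.
\begin{verbatim}
# Sketch: MAE, MBE and adjusted R^2 from placeholder prediction results.
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([100.0, 102.0, 101.0, 105.0, 107.0])   # placeholder true prices
y_pred = np.array([101.0, 101.5, 102.0, 104.0, 108.0])   # placeholder predictions

mae = np.mean(np.abs(y_pred - y_true))      # Mean Absolute Error
mbe = np.mean(y_pred - y_true)              # Mean Bias Error (signs kept)
n, p = len(y_true), 2                       # n samples, p predictors (assumed)
r2 = r2_score(y_true, y_pred)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(mae, mbe, adj_r2)
\end{verbatim}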
@ -1812,7 +1812,7 @@ def create_sets(self, data, lookback, sentiment):
It has taught me how the classical (multinomial) naive Bayes probability model can be used to classify data as spam or ham (wanted), and how the underlying maths and algorithm work, due to hand-coding the algorithm from scratch. It showed how the Bag of Words algorithm for term-frequency identification builds upon the base probability model of the Bayes algorithm; how TF-IDF (Term Frequency-Inverse Document Frequency) builds upon this further, both assigning a weight to capture how often words occur in a given text and using this to identify commonly used words that are of no relevance to classification; and how the Additive Smoothing method helps deal with words that were not seen during training because they were not present in the training data.
Development of this project has given me a further understanding of time management and priorities, in the sense of what needs to focused on during development prior to other features being coded or implemented. An excellent example of where priorities had changed can be seen from the original PID form, \textit{Appendix B}, in which the solution to this project changed from focusing on the front-end application to focusing on the back-end system. This was due to a few factors that have already been identified in the Solution approach section. Where both stakeholders, I the developer and the supervisor of the project concluded that the creation of a front-end application, with a basic back-end for predictions, would not be a satisfactory solution, and more focus should be invested into the price predictions of Bitcoin. Another point where time management had to be considered was when implementing the Naive Bayes Classifier for spam filtering. Time management was not considered during its development and saw the feature go out-of-scope for what was initially wanted - the initial idea was to use the scikit-learns in-built Multinomial Naive Bayes classifier for spam classification. However, tutorials were found on top of the papers used for describing the algorithm during the literature review, which further described how to implement such an algorithm from scratch. Thus this was undertaken and leading to arguably time wasted coded the classification model rather than spending more time on coding the neural network. Arguably in a sense, as detailed in the previous paragraph, it taught me a great deal of how the algorithm works and it's limitations.
Development of this project has given me a further understanding of time management and priorities, in the sense of what needs to be focused on during development before other features are coded or implemented. An excellent example of where priorities changed can be seen in the original PID form, \textit{Appendix B}, in which the solution to this project changed from focusing on the front-end application to focusing on the back-end system. This was due to a few factors that have already been identified in the Solution approach section, where both stakeholders, I as the developer and the supervisor of the project, concluded that the creation of a front-end application with a basic back-end for predictions would not be a satisfactory solution, and that more focus should be invested in the price predictions of Bitcoin. Another point where time management had to be considered was when implementing the Naive Bayes classifier for spam filtering. Time management was not considered during its development, and the feature went out of scope from what was initially wanted - the initial idea was to use scikit-learn's in-built Multinomial Naive Bayes classifier for spam classification. However, tutorials were found, on top of the papers used to describe the algorithm during the literature review, which further described how to implement such an algorithm from scratch. This was undertaken, arguably leading to time wasted coding the classification model rather than spending more time on coding the neural network. Arguably, as detailed in the previous paragraph, it also taught me a great deal about how the algorithm works and its limitations.
Furthermore, it has allowed me to form a better knowledge base and understanding of the Python language and of the data mining techniques used with it to manipulate and use data for a required purpose, and has taught me relevant performance metrics for assessing a neural network and what their results represent.
@ -1825,16 +1825,26 @@ def create_sets(self, data, lookback, sentiment):
\section{Conclusion and Future Improvements}
\subsection{Conclusion}
What was aimed for?
As stated, the project's focus and solution changed considerably from the original Project Initiation Document, which stated in section 2.2 that the main objective was to "produce a thin web client that provides a dashboard, that provides tangible and useful information to users such as; current price" - \textit{of a cryptocurrency} - "exchange rate, network hashrates and historical price data". "It will also display statistics about sentiment analysis conducted on social media about the currency" with "graphical predictions on what the price may be, in a given time". As these extracts show, the initial objectives of the project were broad and vague: they suggest that the focus of the project would be creating a thin-client dashboard, which was no longer the case, and \textit{"about the cryptocurrency"} indicates that sentiment analysis and price predictions would be conducted on multiple currencies, which was an extremely broad estimation of the workload. It does, however, show that the initial thought process of performing sentiment analysis and price predictions occurred during the initial stages of this project, but this ultimately changed through development to focus on how the sentiment of a cryptocurrency, Bitcoin, extracted from social media could be used to aid the prediction of its future price.
What was produced?
With reference to the project's problem statement and solution approach, "This project will focus on the investigation of these technologies and tools" (sentiment analysers, machine learning algorithms and neural networks) "to justify whether it is feasible to predict the price of BTC based on historical price and the sentiment gathered from Twitter". The solution approach stated that the solution is "to create a system mainly consisting of; a frontend application that will display plotted; predicted and true, performance metric data to the user as a clear and concise form", with a back-end prediction system of "various subsystems responsible for data collection, filtering, data pre-processing, sentiment analysis, network training, validation and training, and future price predictions".
The end result followed suitably what was outlined in both the problem statement and solution approach, and met all but one point in the technical specification previously discussed in the reflection - where there were issues with deploying the back-end responsible for price forecasting to the external cloud server and getting it operational. A front-end application was also created which, although basic, served its purpose of presenting the required data to users in a clear format. The back-end also suitably performed all the tasks set out for it, such as data collection from the Twitter API, sentiment analysis using VADER, and predicting the next-hour price of Bitcoin.
The majority of the focus was invested in implementing the back-end prediction solution; although it was fully implemented as intended, some time was wasted hand-coding the Naive Bayes classifier from scratch. This did provide valuable information on exactly how the algorithm works, its limitations and how additional techniques and methods overcome these, but it was ultimately wasted time, as an already used Python package, Scikit-Learn, has multiple in-built Naive Bayes models to choose from. Using these would have reduced coding time that could have been spent elsewhere on the project, for example during testing, to implement k-fold cross-validation, the $R^2$ statistic or the Mean Bias Error, and in turn possibly help identify a correlation between the two models, with and without sentiment, from the metrics rather than relying on the metrics used, which did not show much, and on visual inspection of prediction results.
To conclude, the system developed to meet the requirements set out for this project has been built to the highest possible standard within the time frame given. Moreover, from the discussion of results, it can be argued that the system works well and predicts the next-hour price of Bitcoin appropriately given the data provided. The user interface provides the necessary information to the possible stakeholders, although not pretty, in a clear and concise manner, which is what was intended for the interface.
\subsection{Future Improvements}
Future work could include comparing recurrent neural network models; the implementation and effect of regularisation techniques and of different optimisers on the network; how n-grams could be used to improve language detection; comparing the hand-coded naive Bayes model to scikit-learn's in-built classifiers; and alterations and additions to the VADER lexicon to tailor it with domain-specific language and appropriately weighted sentiment values.
It would also be interesting to see what a day-ahead prediction would show, since sentiment does not directly affect the next-hour price.
Shifting the predicted data by an hour and sequencing over previous data would also allow proper use of look-back windows.
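A minimal sketch of that shift, using pandas; the column names and values are illustrative only.
\begin{verbatim}
# Sketch: align each hour's features with the NEXT hour's price as the target.
import pandas as pd

df = pd.DataFrame({"close": [100.0, 101.0, 102.5, 101.8]})  # placeholder hourly prices
df["target_next_hour"] = df["close"].shift(-1)  # shift the target one step ahead
df = df.dropna()                                # last row has no known next-hour price
print(df)
\end{verbatim}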
@ -1860,6 +1870,8 @@ def create_sets(self, data, lookback, sentiment):
How would this work, and what would it show or validate?
How would changing the epoch count and batch size affect performance?
Use a different charting system, as the plotted lines seem to jump around and skew the apparent accuracy.
\newpage
\nocite{*}

View File

@ -168,10 +168,10 @@
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {14.1}Conclusion}{89}{subsection.14.1}
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {14.2}Future Improvements}{89}{subsection.14.2}
\contentsline {subsection}{\numberline {14.2}Future Improvements}{90}{subsection.14.2}
\defcounter {refsection}{0}\relax
\contentsline {section}{\numberline {15}Appendices}{95}{section.15}
\contentsline {section}{\numberline {15}Appendices}{97}{section.15}
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {15.1}Appendix A - Project Initiation Document}{95}{subsection.15.1}
\contentsline {subsection}{\numberline {15.1}Appendix A - Project Initiation Document}{97}{subsection.15.1}
\defcounter {refsection}{0}\relax
\contentsline {subsection}{\numberline {15.2}Appendix B - Log book}{108}{subsection.15.2}
\contentsline {subsection}{\numberline {15.2}Appendix B - Log book}{110}{subsection.15.2}