Bitcoin Price Prediction Based on Linear Regression and LSTM

Forecasting is used in many fields, including cryptocurrency prediction, finance, and retail. The time series data fed into the algorithm is obtained from Yahoo Finance, which provides refreshed data every day. Market forecasting gives customers and brokers a brief view of how the market may behave in the coming years. Many models are currently in use, such as regression techniques and the Long Short-Term Memory (LSTM) algorithm. FB Prophet is reported to perform better than most other algorithms, with better accuracy. Based on the proposed research and the references, we selected Facebook's Prophet as our forecasting algorithm because it offers better accuracy, a low error rate, and a better fit, and because it handles messy data and null values.


INTRODUCTION
Bitcoin is a digital cryptocurrency that operates on an online decentralized network; it is traded over a peer-to-peer Bitcoin network that does not rely on a central bank or a single administrator. It is accepted in over 40 countries worldwide (including Germany, Canada, and Croatia), and its growing popularity has led to the emergence of new alternative coins. Bitcoin is also used to exchange other cryptocurrencies, products, and services. Since the introduction of this cryptocurrency in 2009, no hacker has been able to infiltrate it, owing to blockchain technology: each electronic coin carries a unique digital signature, which makes it easier to track and to trust. Each owner transfers a coin by digitally signing a hash of the previous transaction together with the public key of the next owner. The price of Bitcoin was 1,000 USD in January 2017 and rose to 16,000 USD by the end of December 2017; as of July 2021 its value was 32,818 USD.

Open Access Full Text Article 123(2022) 87-95 DOI: 10.26524/sajet.2022.12.43

• Seaborn: effective visualization for better understanding of graphs and charts.
• Scikit-Learn: implementation of various algorithms. It contains a large variety of supervised and unsupervised learning algorithms.
• Tkinter: Python's standard GUI toolkit, used to build the GUI application quickly.
• Pickle: serialization and de-serialization of Python object structures, used to store them in a file or database, maintain program state, and transfer data over a network.
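As a minimal sketch of how Pickle could be used to persist a model's state, the example below serializes a plain dictionary to bytes and restores it. The `model_state` name and its coefficient values are illustrative assumptions and are not taken from the paper.

```python
import pickle

# Hypothetical model state; the coefficient values are made up for illustration.
model_state = {"model": "linear_regression", "coef": [0.5, 0.25], "intercept": 1.0}

# Serialize the object to a byte string (pickle.dump would write to a file instead).
blob = pickle.dumps(model_state)

# De-serialize the bytes back into an equivalent Python object.
restored = pickle.loads(blob)

print(restored == model_state)
```

Pickled data should only be loaded from trusted sources, because unpickling can execute arbitrary code.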

RELATED WORK
The prediction of crypto coins using the SVM and SVM-PSO method is suggested, where they

OBJECTIVE
The objective of the proposed system is to develop an application that predicts future Bitcoin prices with decent accuracy. This allows investors to invest wisely in Bitcoin trading, as Bitcoin prices have risen to extraordinary levels over the last ten years.
Thus, the main objectives of the "Bitcoin Price Prediction" can be stated as follows:
1. Develop an application that predicts future Bitcoin prices with decent accuracy.
2. Allow investors to invest wisely in Bitcoin trading, as Bitcoin prices have risen to extraordinary levels over the last ten years.
3. Make use of machine learning algorithms to increase the accuracy of Bitcoin price prediction.

LIMITATION
Although crypto trading has become a new trend, the growing number of digital coins and the adoption of blockchain technology raise the biggest concern: scalability. The number of crypto transactions is still dwarfed by the number of transactions that VISA processes each day. In addition, the crypto market cannot compete on transaction speed with players like VISA and MasterCard until the infrastructure delivering these technologies is massively scaled. The crypto market is very volatile and can never be predicted with 100 percent accuracy. The market also depends on human sentiment; a person owning 100 or more Bitcoin may suddenly sell the entire holding and cause a big dip in the crypto market. We can never predict human emotion, even with the advanced technology we have.

Blockchain, which is encrypted with trade information on a public or private network, is a distributed ledger shared with the relevant network participants.

Disadvantages:
1. In the existing system, blockchain technology is used to predict the Bitcoin price.
2. With blockchain technology, prices are not constant; they may vary from day to day.
3. With blockchain technology, we cannot predict future prices.

PROPOSED SYSTEM
The proposed model predicts the Bitcoin price using machine learning and a neural network: Linear Regression is the machine learning model, and LSTM is the neural network model. In data segregation we use features such as Open, Close, High, Low, Volume BST, Volume UDST, Time, and Symbol. The Linear Regression accuracy rate is 99.87%, whereas the LSTM accuracy rate is 97.56%.

In this study, we used Bitcoin data sets for training and testing the ML and AI models. The data was filtered with the help of Python libraries, which provide strong support for data analysis and visualization. After understanding the data, we trim it and keep the features or attributes best suited to the model. The model is then implemented and the results are recorded. The Linear Regression model's accuracy rate of 99.87 percent is very high compared with the other machine learning models from related works. The LSTM model, on the other hand, shows a minimal error rate of 0.08 percent. This, in turn, suggests that the neural network model is more optimized than the machine learning model.
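The paper does not state how the data was divided into training and testing sets. The following is a minimal sketch of one common choice for time series, a chronological split that keeps the most recent observations for testing. The function name `chronological_split`, the 80/20 ratio, and the sample prices are all assumptions made for illustration.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split an ordered sequence into (train, test) without shuffling.

    The earliest observations go to the training set and the latest
    to the test set, so the model is never trained on future data.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Made-up daily closing prices, oldest first.
closes = [float(price) for price in range(100, 110)]
train, test = chronological_split(closes)
print(len(train), len(test))
```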

Data collection:
Data collection is the first step in any project. It is defined as the procedure of collecting, measuring, and analyzing accurate insights for research using standard validation techniques. A researcher can then evaluate a hypothesis on the basis of the collected data. In general, data collection is the primary and most important step in research, irrespective of the field of study. The approach to data collection differs across fields of study, depending on the information required. The most important objective of data collection is to ensure that the gathered information is rich in content and reliable for statistical analysis, so that data-driven decisions can be made efficiently and effectively. The data set contains daily transactions from 29th August 2017 to 9th August 2020. The data is first tested with regression techniques, and then a deep learning model is implemented, since deep learning can provide better accuracy than classical machine learning when large amounts of data are available.
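The date range described above can be applied while loading the data. The sketch below parses CSV text and keeps only rows dated between 29 August 2017 and 9 August 2020. The column names, the ISO date format, and the sample prices are assumptions made for illustration; the real data file may be laid out differently.

```python
import csv
import io
from datetime import date

# Made-up sample rows standing in for the real daily data file.
SAMPLE_CSV = """Date,Open,High,Low,Close
2017-08-28,4300.0,4400.0,4200.0,4350.0
2017-08-29,4350.0,4600.0,4300.0,4580.0
2020-08-09,11750.0,11800.0,11500.0,11680.0
2020-08-10,11680.0,12050.0,11600.0,11890.0
"""

START = date(2017, 8, 29)  # first day of the range stated in the text
END = date(2020, 8, 9)     # last day of the range stated in the text


def load_daily_rows(csv_text, start=START, end=END):
    """Parse CSV text and keep rows whose date lies in [start, end]."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        day = date.fromisoformat(record["Date"])
        if start <= day <= end:
            row = {"Date": day}
            for name in ("Open", "High", "Low", "Close"):
                row[name] = float(record[name])
            rows.append(row)
    return rows


rows = load_daily_rows(SAMPLE_CSV)
print(len(rows))
```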

Feature Selection:
Now that we have the required data for the project, we begin the next procedure, called data segregation or feature selection. In this process we trim unwanted or unnecessary data from the data set. This step is necessary because we require only those features that contribute to the prediction; unnecessary data can cause noise in the final output. Put simply, we segregate the data to obtain a better model that gives an optimized result, to reduce over-fitting and redundancy, and to reduce the training time so that the system can generate output faster and with higher accuracy. In this project, we used a few existing Python libraries that help with data visualization and make it easier to identify the important features required by the system. Data visualization is a technique in which data or information is represented in a diagrammatic format for better understanding. It helps us communicate the relationships within the data by means of images, which take the form of patterns that can be understood very easily. This is one of the main ways in which visualization supports data analysis in machine learning. Whether you work in finance, marketing, engineering, or design, you need to visualize data to understand it. This makes data visualization an important factor in today's world.
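The paper does not state the exact selection criterion. One simple possibility, in line with the correlation graph between features, is to keep the features whose Pearson correlation with the closing price exceeds a threshold. The sketch below assumes this approach; the 0.8 threshold, the `pearson` helper, and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up values for five trading days.
data = {
    "Open": [100, 102, 101, 105, 107],
    "High": [103, 104, 104, 108, 110],
    "Low": [99, 100, 100, 103, 106],
    "Volume": [500, 480, 510, 470, 505],
    "Close": [102, 101, 104, 107, 109],
}

TARGET = "Close"
THRESHOLD = 0.8  # assumed cut-off, not taken from the paper

# Keep every feature that is strongly correlated with the target column.
selected = [
    name
    for name, column in data.items()
    if name != TARGET and abs(pearson(column, data[TARGET])) >= THRESHOLD
]
print(selected)
```

With this sample, the price columns are retained and the weakly correlated volume column is dropped.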

LINEAR REGRESSION
This technique identifies the relationship between a dependent variable and one or more independent variables and is used to predict future outcomes. When there is only one independent variable, it is called simple linear regression. When there are two or more independent variables, it is referred to as multiple linear regression. The fitted model is plotted as a straight line across the graph, chosen as the best fit by the method of least squares.
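The method of least squares for the simple case can be sketched in a few lines. The example below computes the slope and intercept that minimize the sum of squared residuals for one independent variable. It is an illustration only, using made-up points that lie on the line y = 2x + 1; the function name is hypothetical, and a library such as Scikit-Learn would normally be used in practice.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
prediction = slope * 6.0 + intercept  # predict y for an unseen x
print(slope, intercept, prediction)
```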

LONG SHORT-TERM MEMORY (LSTM)
LSTM is a deep learning technique, specifically a type of Recurrent Neural Network (RNN), that mitigates the vanishing gradient problem. The main reason for using this algorithm is that it prevents back-propagated errors from vanishing or exploding; instead, these errors can flow backward through an unlimited number of virtual layers unfolded in space. LSTM works mainly on time series data and can learn from events that occurred thousands or even millions of discrete time steps earlier. It copes with long delays between significant events and can also handle signals with a mixture of low-frequency and high-frequency components. Many researchers have used LSTM on time series data sets for stock prediction and have achieved higher accuracy compared with other methods.

The mean absolute error (MAE) used to evaluate the LSTM model is

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - x_i\right|$$

where $y_i$ = prediction, $x_i$ = true value, and $n$ = total number of data points.
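To make the LSTM gating mechanism concrete, the following is a minimal sketch of one forward step of an LSTM cell, reduced to scalar inputs and states for readability. The weight names and values are arbitrary assumptions and do not come from the paper; a real model would use vector states and weights learned by a deep learning library.

```python
import math


def sigmoid(z):
    """Logistic function, used for the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_step(x, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell.

    p holds input weights (w_*), recurrent weights (u_*) and biases (b_*)
    for the forget (f), input (i), output (o) gates and the candidate (g).
    """
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    # Additive cell-state update: this path lets gradients flow across
    # many time steps without repeatedly passing through a squashing function.
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Arbitrary illustrative weights (not trained).
params = {
    "w_f": 0.5, "u_f": 0.1, "b_f": 0.0,
    "w_i": 0.6, "u_i": 0.2, "b_i": 0.0,
    "w_o": 0.4, "u_o": 0.3, "b_o": 0.0,
    "w_g": 0.7, "u_g": 0.1, "b_g": 0.0,
}

h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:  # a short made-up input sequence
    h, c = lstm_cell_step(x, h, c, params)
print(h, c)
```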

RESULT AND DISCUSSION
After the data analysis process, we found that only four features were well suited to this project. The data was trimmed so that only the selected features remained, as shown in Figure 4.
The analysis of any technical chart comprises three major topics: trend and momentum, which indicate the direction and the strength of that direction; support and resistance, which indicate the potential stopping points of those directions; and the general pattern, which conveys information about market psychology. Cryptocurrencies have not been around long enough to provide sufficient information about key support and resistance levels compared with the stock market, currencies, and commodities. This makes them difficult to predict and to trade.

EXISTING SYSTEM
A cryptologic pioneer, David Chaum, devised blind signature technology, which transmits encoded messages sealed with a digital signature and led to the invention of Ecash, the first commercial cryptocurrency. Bitcoin appeared in 2009 as a new cryptocurrency. Cryptocurrencies have since been improved with blockchain-based technology. Ethereum emerged in 2015 as an evolved currency that offers services and applications in addition to the blockchain system. The WEF (World Economic Forum) ranked blockchain fourth among 12 future technologies in the Global Risks Report. Furthermore, within 10 years, 10 percent of worldwide GDP is expected to be based on blockchain technology. In April 2019, about 40 major banks around the world announced that they would experiment with CBDC (central bank digital currencies) founded on

Fig 1. Display of the Data collected

Fig 2. The Features represented in the data

Fig 3. Correlation graph between the features.
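$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$

where MSE = mean squared error, $n$ = number of data points, $Y_i$ = observed values, and $\hat{Y}_i$ = predicted values.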

Fig 4. Attributes/features selected are Open, High, Low, and Close

We can see the output of two models: the machine learning model, i.e., Linear Regression, and the recurrent neural network model, i.e., Long Short-Term Memory (LSTM), which produce two different outcomes. Linear Regression is evaluated with the Mean Squared Error, which measures how well the fitted line matches the continuous time-frame data set. The accuracy on the training data is approximately 99.97%, and the accuracy on the testing data is also approximately 99.97%, as shown in Figure 5. Meanwhile, the LSTM model is evaluated with the Mean Absolute Error, which shows an error rate of approximately 0.08%, as shown in Figure 6.
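The two error measures named above can be computed directly from their definitions. The sketch below uses made-up actual and predicted prices to illustrate the calculation; the figures are not results from the paper, and the function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up closing prices and model predictions.
actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 104.0]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(mse, mae)
```

Because it squares each error, MSE penalizes large errors more heavily than MAE.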

Fig 5. Accuracy obtained from the training and testing data sets using the Linear Regression model

used the day trading method to predict the values of ETH, BTC, XEM, XRP, XLM, and LTC. SVM-PSO shows the optimized results. The performance accuracy of different classifiers differs from coin to coin. However, this paper works only with a machine learning algorithm, and hence the results can be further improved by implementing deep learning. The prediction of the Bitcoin price using a transaction graph is also proposed. The experiment consists of the Baseline, Logistic Regression, SVM, and Neural Network models, with accuracies of 53.4%, 54.3%, 53.7%, and 55.1%, respectively. The feature selection in this paper is based on the Bitcoin blockchain network, which tends to be the least informative feature for the prediction of the