GAURAV KUMAR, Resource Person and Trainer: High Performance Cloud Platforms for Scientific Computing Applications

Nowadays, software applications as well as smart devices and gadgets face substantial performance challenges, including load balancing, turnaround time, delay, congestion, big data processing, parallel computation and many others. Handling these traditionally consumes large amounts of computational resources, and low-configuration computers cannot cope with high performance tasks. The laptops and desktops available in the market are designed for personal use, and they run into serious performance problems when high performance jobs must be solved.

For example, a desktop computer or laptop with a 3 GHz processor runs about 3 billion clock cycles per second, which corresponds to roughly 3 billion simple computations per second. High Performance Computing (HPC) focuses on solving complex problems that involve trillions or quadrillions of computations, at high speed and with maximum accuracy.
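
To put these figures in perspective, the arithmetic can be sketched as follows. This is only a rough estimate: it assumes one computation per clock cycle on a single core, which is a simplification, and the workload sizes are the round figures mentioned above.

```python
# Rough estimate: time needed by a 3 GHz machine for HPC-scale workloads.
# Assumption: one computation per clock cycle on a single core.

OPS_PER_SECOND = 3_000_000_000  # 3 GHz, about 3 billion computations/second


def seconds_needed(total_ops, ops_per_second=OPS_PER_SECOND):
    """Return the seconds required to finish total_ops computations."""
    return total_ops / ops_per_second


workloads = {
    "one trillion (10^12)": 10**12,
    "one quadrillion (10^15)": 10**15,
}

for label, ops in workloads.items():
    secs = seconds_needed(ops)
    print(f"{label}: {secs:,.0f} s (~{secs / 86400:.2f} days)")
```

Under these assumptions, a trillion computations take about five and a half minutes, while a quadrillion take close to four days on a single such machine.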

High performance computing is used in domains where the required speed and accuracy are much higher than in traditional scenarios. For this reason, deploying high performance computing is very costly, yet it remains necessary because of the sensitivity and requirements of the application domain.

The following are use cases and scenarios where high performance implementations are required:
Nuclear Power Plants
Space Research Organizations
Oil and Gas Explorations
Artificial Intelligence and Knowledge Discovery
Machine Learning and Deep Learning
Financial Services and Digital Forensics
Geographical and Satellite Data Analytics
Bio-Informatics and Molecular Sciences

A number of cloud platforms are available on which high performance computing applications can be launched without access to an actual supercomputer. With these cloud services, billing is based on usage, which costs less than purchasing the infrastructure required for high performance computations.
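
The trade-off between usage-based billing and buying hardware can be sketched as a simple break-even calculation. All figures below are hypothetical placeholders chosen only for illustration; they are not prices from any provider, and the sketch ignores maintenance, power and staffing costs.

```python
# Break-even sketch: renting cloud compute vs. purchasing hardware.
# All numbers are hypothetical placeholders, not real provider prices.


def cloud_cost(hours, hourly_rate):
    """Pay-per-use cost: billed only for the hours actually used."""
    return hours * hourly_rate


def break_even_hours(purchase_price, hourly_rate):
    """Hours of usage at which renting costs as much as buying."""
    return purchase_price / hourly_rate


HOURLY_RATE = 2.50       # hypothetical cost per hour of cloud compute
PURCHASE_PRICE = 50_000  # hypothetical up-front hardware cost

print(cloud_cost(200, HOURLY_RATE))                   # cost of a 200-hour project
print(break_even_hours(PURCHASE_PRICE, HOURLY_RATE))  # hours before buying pays off
```

With these placeholder values, a 200-hour project costs 500 units, and renting stays cheaper until about 20,000 hours of use.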

The following are a few of the prominent cloud-based platforms that can be used for advanced implementations, including data science, data exploration, machine learning, deep learning, artificial intelligence and many others.

Neptune: https://neptune.ml/

Neptune is a lightweight cloud-based service for high performance applications including data science, machine learning, predictive knowledge discovery, deep learning, modeling training curves and many others. Neptune can be integrated with Jupyter notebooks so that Python programs can be executed easily for multiple applications.

The Neptune dashboard is available at https://ui.neptune.ml/, where multiple experiments can be run. Neptune works as a machine learning lab in which assorted algorithms can be programmed and their outcomes visualized. The platform is offered as Software as a Service (SaaS), so deployment can be done in the cloud. Deployments can also run on the user's own hardware and be connected to the Neptune cloud.

In addition to the pre-built cloud-based platform, Neptune integrates with Python and R so that high performance applications can be programmed. Python and R are prominent programming environments for data science, machine learning, deep learning, big data and many other applications.

For Python programming, Neptune provides neptune-client, which communicates with the Neptune server so that advanced data analytics can be implemented on its cloud.

For integrating Neptune with R, the "reticulate" library makes it possible to use neptune-client from R.

The detailed documentation for integrating Python and R with Neptune is available at https://docs.neptune.ml/python-api.html and https://docs.neptune.ml/r-support.html, respectively.

In addition, integrations with MLflow and TensorBoard are available. MLflow is an open source platform for managing the machine learning lifecycle, including reproducibility, experimentation and deployment. It has three key components: Tracking, Projects and Models. These can be programmed and controlled using the Neptune MLflow integration.

TensorFlow can be connected to Neptune using Neptune-TensorBoard. TensorFlow is one of the most powerful frameworks for deep learning and advanced knowledge discovery.

With its assorted features, the Neptune cloud can be used for high performance, research-based implementations.

BigML: https://bigml.com/

BigML is a cloud-based platform for implementing advanced algorithms on assorted datasets. It provides a panel from which multiple machine learning algorithms can be applied with ease.

The BigML dashboard gives access to different datasets and algorithms, grouped under supervised and unsupervised categories, as shown in Figure 4. The researcher can choose an algorithm from the menu according to the requirements of the research domain.

A number of tools, libraries and repositories are integrated with BigML so that programming, collaboration and reporting can be done with a higher degree of performance and minimal errors.

Algorithms and techniques can be attached to a specific dataset for evaluation and deep analytics, as shown in Figure 5. With this methodology, the researcher can work with both the code and the dataset on a single, easier platform.

The following tools and libraries are associated with BigML for multiple applications of high performance computing:
Node-RED for Flow Diagrams
GitHub Repos
BigMLer as Command Line Tool
Alexa Voice Service
Zapier for Machine Learning Workflows
Google Sheets
Amazon EC2 Image PredictServer
BigMLX App for macOS

Google Colaboratory: https://colab.research.google.com

Google Colaboratory is a cloud platform for implementing high performance computing tasks, including artificial intelligence, machine learning, deep learning and many others. It is a cloud-based service built around Jupyter Notebook, so Python code can be executed as required by the application domain.

Google Colaboratory is available as a Google app within Google's cloud services. It can be opened from Google Drive, as depicted in Figure 6, or directly at https://colab.research.google.com.

A Jupyter notebook in Google Colaboratory runs on a CPU by default. If a hardware accelerator such as a Tensor Processing Unit (TPU) or Graphics Processing Unit (GPU) is required, it can be activated from Notebook Settings.
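
After changing the setting, it can be useful to confirm from inside the notebook which accelerator the runtime actually exposes. The sketch below uses only the Python standard library and relies on two heuristics that are assumptions rather than an official Colab API: older TPU runtimes set the COLAB_TPU_ADDR environment variable, and GPU runtimes ship the nvidia-smi tool. The function name detect_accelerator is hypothetical.

```python
import os
import shutil


def detect_accelerator():
    """Guess which accelerator the current runtime exposes.

    Heuristics (assumptions, not an official Colab API):
    - a TPU runtime sets the COLAB_TPU_ADDR environment variable
    - a GPU runtime has the nvidia-smi tool on the PATH
    """
    if os.environ.get("COLAB_TPU_ADDR"):
        return "TPU"
    if shutil.which("nvidia-smi"):
        return "GPU"
    return "CPU"


print("Runtime accelerator:", detect_accelerator())
```

On a default notebook this is expected to report CPU. Newer runtimes may expose accelerators differently, so the result should be treated as a hint only.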

The dataset can be placed in Google Drive. The dataset under analysis is then linked to the code so that the script can operate on it directly. The outputs and logs are presented in the Jupyter notebook within Google Colaboratory.
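
A minimal sketch of this workflow is shown below. It uses the drive.mount call from the google.colab package, which is only available inside Colab, and falls back to the local directory elsewhere. The helper names, the "MyDrive" folder name and the datasets/sample.csv path are assumptions for illustration and should be adjusted to the actual Drive layout.

```python
import csv
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return the data root."""
    try:
        from google.colab import drive  # only available inside Colab
    except ImportError:
        return os.getcwd()  # outside Colab: fall back to the local directory
    drive.mount(mount_point)
    # Assumption: Drive files appear under "MyDrive" below the mount point.
    return os.path.join(mount_point, "MyDrive")


def load_rows(csv_path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    root = mount_drive()
    path = os.path.join(root, "datasets", "sample.csv")  # hypothetical path
    if os.path.exists(path):
        rows = load_rows(path)
        print(f"Loaded {len(rows)} rows from {path}")
    else:
        print(f"No dataset found at {path}")
```

Inside Colab, the mount step asks for authorization once per session; after that the dataset can be read like any local file.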

Deep Cognition: https://deepcognition.ai/

Deep Cognition provides a platform for implementing advanced neural networks and deep learning models. AutoML with Deep Cognition provides an autonomous Integrated Development Environment (IDE) in which advanced models can be coded, tested and debugged.

It has a Visual Editor in which multiple layers of different types can be arranged. The layers that can be imported include Core Layers, Hidden Layers, Convolutional Layers, Recurrent Layers, Pooling Layers and many others.

The platform supports working with the MXNet and TensorFlow frameworks and libraries for scientific computations and deep neural networks.

Research scholars, academicians and practitioners can work on advanced algorithms and their implementations using cloud-based platforms dedicated to high performance computations. With this type of implementation, there is no need to purchase specific infrastructure or devices; instead, a supercomputing environment can be rented in the cloud.

Sunday, October 27, 2019