I’m absolutely infatuated with learning. Each day, I love investing time in discovering more of the information that’s available to me. Working within an industry that evolves at such a rapid pace, it’s a bare necessity to understand new concepts and practices in order to remain relevant.
Last year, I attended an Interactive Minds meetup, where one of the guest speakers said something that really resonated with me:
“The more you know, the more you know you don’t know”.
After piecing together this puzzle of a sentence, I was able to understand the real value that comes with the process of learning. Learning truly can become addictive!
Over the past year, I’ve been blessed with the opportunity to learn more than I could have ever imagined. By not only working alongside the talented team at Max Kelsen, but also meeting some incredible individuals, I’ve been exposed to some truly fascinating concepts.
Each time I’d discover something that I couldn’t comprehend, I’d take the time to source an explanation. I was also sure to record this newfound information in a document for future reference. After a year of following this procedure, my notes have accumulated into quite a large personal encyclopaedia.
My annotations covered a variety of terms relating to machine learning, programming languages, marketing, cloud computing, and industry acronyms. In every instance, my original notes were written from an easy-to-consume perspective that would help me better understand the core concept.
As some of the concepts can be quite complex, I wanted to share my annotations to help educate anyone who is currently working within, or is new to the tech industry.
Artificial Intelligence & Machine Learning:
PyTorch: Open-source machine learning library for Python.
Keras: Open-source neural network library written in Python.
Autoencoder: Neural network trained to reproduce its input at its output, forcing it to learn a compressed internal representation. Often used for generating or denoising images/content.
Evolution Strategy (Machine Learning): An optimisation technique that can achieve results similar to reinforcement learning. Instead of backpropagating gradients, many randomly perturbed copies of the parameters are evaluated in parallel, and the best-performing perturbations are combined into the next set of parameters.
K-Fold Cross-Validation: The process of splitting a dataset into k equal folds (e.g. 10), training on k−1 folds and testing on the remaining one, then rotating so every fold is used for testing exactly once. This gives a more reliable estimate of model performance than a single train/test split.
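As a sketch, the fold rotation can be written in a few lines of plain Python (the fold count and toy data below are purely illustrative):

```python
def k_fold_splits(data, k):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    fold_size = len(data) // k
    for i in range(k):
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, test

# Ten toy samples split into 5 folds of 2.
samples = list(range(10))
splits = list(k_fold_splits(samples, 5))
```

Across the five splits, every sample lands in a test fold exactly once, which is what makes the averaged score more trustworthy than one arbitrary split.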
Spiking Neural Networks (SNN): Third generation of neural network models designed to behave more like biological neurons – where discrete spikes of activity occur when “thinking”.
CapsNet: A neural network built from capsules layered on top of a CNN. It processes components of an image in relation to each other, so it can more reliably identify the context and content of an image, e.g. recognising an object even when the image is upside down.
Attentive/Attention/Spatial Network: A neural network that focuses on select areas of an image at a time in high resolution, while the remaining areas are processed in low resolution. This is similar to how humans identify objects in sight.
Knowledge Distillation: The process of training a large “teacher” network on a dataset, then training a smaller “student” network to reproduce the teacher’s outputs rather than learning from the raw labels alone. I.e. learning from learning.
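One common ingredient of distillation, sketched here from my own understanding, is softening the teacher’s outputs with a “temperature” so the student can see how confident the teacher was about every class, not just the winning one (the logits below are invented for illustration):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature: higher T flattens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

teacher_logits = [6.0, 2.0, 1.0]           # hypothetical teacher outputs
hard = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=5.0)
```

At temperature 1 the teacher looks almost certain of class 0; at temperature 5 the probabilities flatten out, exposing the relative ranking of the “wrong” classes, which is the extra signal the student trains on.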
SSD – Single Shot Detector: A neural network that detects and localises objects within an image in a single pass. It’s a more streamlined process than running a classifier over many candidate regions, as it doesn’t require the model to process unused image real estate.
Caffe2: An open-source deep learning framework library.
ONNX: An open-source format for representing neural network models, allowing them to be exchanged between different frameworks and platforms.
MXNet: A modern open-source deep learning framework for training and deploying neural networks.
Backpropagation: The process of calculating, layer by layer, how much each weight in a network contributed to the error (its gradient), so the weights can then be adjusted to improve network performance.
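As a toy illustration (a single weight rather than a deep network), the chain-rule gradient can be computed by hand and checked against a numerical estimate:

```python
def forward(w, b, x):
    return w * x + b                      # a one-weight "network"

def loss(y, target):
    return (y - target) ** 2              # squared error

w, b, x, target = 0.5, 0.1, 2.0, 1.0
y = forward(w, b, x)

# Chain rule: dL/dw = dL/dy * dy/dw = 2*(y - target) * x
grad_w = 2 * (y - target) * x

# Sanity-check against a finite-difference estimate of the same gradient.
eps = 1e-6
numeric = (loss(forward(w + eps, b, x), target) -
           loss(forward(w - eps, b, x), target)) / (2 * eps)

# One gradient-descent step should reduce the loss.
new_loss = loss(forward(w - 0.1 * grad_w, b, x), target)
```

Real frameworks like PyTorch automate exactly this bookkeeping across millions of weights.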
U-Net: A convolutional network, commonly used for image segmentation, that scales a high-resolution image down to capture context and then scales it back up to classify features at full resolution, with skip connections between matching scales.
Directed Acyclic Graph (DAG): A graph whose edges have a direction and which contains no cycles. Useful for modelling data pipelines, where the output of one step feeds into the next and the steps can be ordered unambiguously.
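A hypothetical four-step pipeline expressed as a DAG can be ordered with Python’s standard-library graphlib (the task names here are made up for illustration):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
pipeline = {
    "clean":     {"extract"},
    "transform": {"clean"},
    "load":      {"transform"},
}

# Because the graph has no cycles, a valid execution order always exists.
order = list(TopologicalSorter(pipeline).static_order())
```

Here `order` lists every task so that each one appears only after all of its dependencies; a cycle (e.g. "extract" depending on "load") would raise an error instead.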
Siamese Neural Network: A convolutional network that feeds two different images into two identical side-by-side networks sharing the same weights. The outputs are compared to classify how similar or different the two input images are.
Transfer learning: The process of transferring knowledge gained from one model’s solution over into another model to form a new baseline solution.
Shallow learning: Similar to deep learning, however it is based on memorising content rather than understanding its context.
WaveNet: DeepMind’s model for generating raw audio – voice and music – from scratch.
Google MobileNets: Efficient image-recognition models designed to run locally/on the edge on mobile devices.
Impact encoding: An ML technique for encoding categorical features, where you replace each category with the mean of the labelled target for that category. To keep the estimates honest, the data is often randomly split into buckets, the mean is found for each bucket, and the means are then compared across all buckets (an average of averages). The process is repeated so all data is tested against itself.
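A bare-bones sketch of the core step – replacing each category with the mean target observed for it – in plain Python (toy data; the bucket-splitting/average-of-averages step is omitted here for brevity):

```python
from collections import defaultdict

# Hypothetical rows: a categorical feature and a binary target.
rows = [("red", 1), ("red", 0), ("blue", 1), ("blue", 1), ("green", 0)]

sums = defaultdict(float)
counts = defaultdict(int)
for category, target in rows:
    sums[category] += target
    counts[category] += 1

# Each category is encoded as the mean target value seen for it.
encoding = {c: sums[c] / counts[c] for c in sums}
```

In practice the bucketed version matters, because computing the mean on the same rows you train on leaks the target into the feature.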
Low-shot learning: A model (e.g. trained on ImageNet) that is able to learn new categories from only a small training set, by reusing what it learned during its original training to increase accuracy – (learning from learning).
Asynchronous SGD: An ML training setup that splits weights across different parameter servers (VMs), allowing them to train separately. Once an individual weight has been updated, training advances, meaning the old weight becomes stale/obsolete. Different weights can be at different stages at different times.
PixelCNN: An ML model that recreates images one pixel at a time, with each new pixel conditioned on the pixels generated before it in the exact position (unlike GANs, which generate whole images at once and can move objects in recreated images).
CIFAR-10: A dataset of 60,000 small (32×32) colour images across 10 classes, commonly used for training computer vision models.
Data Science & Cloud Services:
Amazon Kinesis: Collects and processes large streams of data – including video – in real time.
Amazon Athena: Allows you to query S3 data using standard SQL.
Microservices: A variant of service-oriented architectures that decompose an application into smaller services.
ELK stack: Elasticsearch, Logstash, and Kibana – A web interface used for storing and parsing logs that can be easily searched at a later point.
Apache Kafka: An Apache streaming platform, written in Scala and Java, for handling real-time data feeds.
Apache Hive: Data warehouse built on top of Hadoop for providing data summarisation, query, and analysis. Has an SQL-like interface for querying data stored in Hadoop.
Apache Solr: An advanced search and indexing platform that can be used for creating a dynamic UI and content recommendations for each individual user.
BigQuery: Google’s tool that enables interactive analysis of very large datasets stored in Google Cloud.
Redis: An open-source, in-memory data store. Queries are typically very fast because the data is held in memory rather than on disk, which is why it’s often used for caching.
Amazon Lightsail: A platform for launching virtual machines.
Amazon CloudWatch: Platform used for collecting and tracking metrics in AWS resources. Also used for monitoring health of cloud services.
Data partitioning: Storing historical data in segments so it’s easier to query at a later date.
Nested query: A subquery within a query. E.g. if X = ____, then perform ____.
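A small runnable example using Python’s built-in sqlite3, where the inner query computes an average that the outer query then filters against (the table and values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 50.0), (2, 120.0), (3, 200.0)])

# Nested query: the inner SELECT computes the average order amount,
# and the outer SELECT keeps only orders above that average.
rows = conn.execute(
    "SELECT id FROM orders "
    "WHERE amount > (SELECT AVG(amount) FROM orders)"
).fetchall()
```

The inner SELECT runs first (average ≈ 123.3), so only order 3 survives the outer filter.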
Graph database: A database formatted as a graph in which everything connects through nodes and the relationships between them.
Amazon SageMaker: Amazon’s platform for building and deploying machine learning models.
Amazon Fargate: A serverless compute platform for running containers without managing servers, used with Amazon’s container and Kubernetes services (ECS and EKS).
AWS Greengrass: Platform for creating programs/models in the cloud and then deploying them to run locally on IoT devices.
Data-provenance: The process of tracing and recording the origins of data and its movement between databases.
Homomorphic Encryption: A form of encryption that lets you run computations directly on encrypted data; when the result is decrypted, it matches the result of running the same computation on the original data. This means behaviours can be analysed while the underlying user data remains private.
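As a toy illustration of computing on encrypted data: unpadded “textbook” RSA happens to be multiplicatively homomorphic – multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. The tiny key below (p = 61, q = 53) is for demonstration only and offers no real security:

```python
# Textbook RSA key: n = p*q = 3233, public exponent e, private exponent d.
n, e, d = 3233, 17, 2753

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 7, 3
# Multiply the ciphertexts -- no decryption involved.
product_of_ciphertexts = (encrypt(a) * encrypt(b)) % n
# Decrypting the combined ciphertext reveals the product of the plaintexts.
result = decrypt(product_of_ciphertexts)
```

The multiplication happened entirely on encrypted values, yet decrypting the result gives 7 × 3. Fully homomorphic schemes extend this idea to arbitrary computations, at a much higher cost.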
NoSQL: A non-relational database for querying unstructured data.
AWS AppSync: A service that allows an app to cache offline events and upload them when back online.
Helm: A package manager for Kubernetes. Built for packaging Kubernetes applications into reusable “charts” so they can be installed and upgraded across clusters.
SAP Hana: A column-oriented relational database for storing and retrieving data as requested by applications.
Cron: A software utility that enables you to schedule commands or scripts to run at set times. Often used when scheduling recurring jobs such as container deployments.
Solidity: A programming language for writing smart contracts – programs that run on blockchains such as Ethereum.
Next.js: A framework for server-side rendered React apps.
Ruby: An object-oriented programming language.
Unicode: A universal encoding standard where each character is assigned a numeric value that can be translated across different platforms and languages.
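A quick Python demonstration of code points and a round trip through UTF-8 bytes, which is how the same text survives moving between platforms:

```python
# Every character has a numeric code point.
code_point = ord("A")        # the code point assigned to "A"
char = chr(0x20AC)           # the character at code point U+20AC (euro sign)

# Encoding turns text into bytes; decoding recovers the identical text.
encoded = "café".encode("utf-8")   # "é" takes two bytes in UTF-8
decoded = encoded.decode("utf-8")
```

Note that the byte length and the character length differ: "café" is four characters but five UTF-8 bytes.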
React Suspense: A React feature that suspends the display of content blocks/UI components until they have all finished loading – displaying the complete UI at once.
I’m a twenty-two-year-old Digital Marketing & Conversions Specialist based in Brisbane, Australia. With a passion for all things digital and tech, I aim to connect with and learn from as many like-minded digital enthusiasts as possible.