
Dimension Reduction:
Autoencoders

Anastasios Panagiotelis

University of Sydney

1

Outline

  • Neural Network
    • An entire course could be done on neural networks, I will give only the basic idea.
  • Idea behind Autoencoder
    • Has the same input and output.
    • Is narrow in the middle.
2

Neural Networks

3

Autoencoders are Neural Nets

  • Many of you are probably familiar with neural networks.
  • The autoencoder is a neural network with a clever trick for dimension reduction.
  • I will briefly cover neural nets for those without any background.
4

The meat grinder

  • Consider inputs x, a target t and some function f() such that f(x) is close to the target t.
  • As an analogy, think of a meat grinder used to make salami.

5

Inside the grinder

6

Details

  • What happens at each node?
  • First, a linear combination is taken of all inputs to a node.
    • Made up of weights and biases (intercepts) that need to be estimated.
  • The linear combination is then passed through a non-linear function (see the sketch below).
    • Sigmoid function
    • ReLU
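
A minimal sketch of one node in Python; the function names and the use of numpy are my own illustrative choices, not from the slides.

    import numpy as np

    def sigmoid(z):
        # squashes any real number into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def relu(z):
        # keeps positive values, zeroes out negatives
        return np.maximum(0.0, z)

    def node(x, w, b, activation=relu):
        # step 1: linear combination of the inputs (weights w, bias b)
        z = np.dot(w, x) + b
        # step 2: pass the result through a non-linear activation
        return activation(z)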
7

Training a neural network

  • Consider a loss function that compares output to target
    • Mean square error
    • 0-1 loss
    • ...
  • Find values of the weights and biases that minimise the loss function over a training sample (see the sketch below).
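
As a hedged illustration, the first two losses above could be written as follows (the function names are mine, not from the slides).

    import numpy as np

    def mse_loss(f_x, t):
        # mean square error: average squared distance from the target
        return np.mean((np.asarray(f_x) - np.asarray(t)) ** 2)

    def zero_one_loss(f_x, t):
        # 0-1 loss: proportion of outputs that miss the target
        return np.mean(np.asarray(f_x) != np.asarray(t))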
8

How to do this?

  • Many computational tricks for finding optimal weights (one step is sketched below)
    • Automatic differentiation
    • (Stochastic) Gradient Descent
    • Use of GPUs
  • Again, an entire course would be needed to do this justice.
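
A minimal sketch of one stochastic gradient descent step using tensorflow's automatic differentiation; the single linear layer and all sizes here are illustrative assumptions.

    import tensorflow as tf

    # toy parameters: 10 inputs, 1 output (illustrative only)
    w = tf.Variable(tf.random.normal([10, 1]))
    b = tf.Variable(tf.zeros([1]))
    opt = tf.keras.optimizers.SGD(learning_rate=0.01)

    def sgd_step(x_batch, t_batch):
        # automatic differentiation: operations inside the tape are
        # recorded, then the loss is differentiated w.r.t. w and b
        with tf.GradientTape() as tape:
            f_x = tf.matmul(x_batch, w) + b
            loss = tf.reduce_mean((f_x - t_batch) ** 2)
        grads = tape.gradient(loss, [w, b])
        # (stochastic) gradient descent: move the parameters downhill
        opt.apply_gradients(zip(grads, [w, b]))
        return loss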
9

Trained Neural Net

10

Autoencoders

11

Stretching the analogy

  • Now suppose the aim is not to make salami but dimension reduction.
  • The idea is to break ingredients down into protons, electrons and neutrons.
  • A neural network can be constructed for this problem.
  • The trick is to use all variables as both input and output.
12

An autoencoder

13

The key idea

  • There is a "skinny" middle layer containing fewer nodes than input (and output) variables.
  • Everything up to this skinny part is called "the encoder".
    • Think of this as breaking down ingredients into protons, neutrons and electrons.
  • Everything past this skinny part is the "decoder".
    • Think of this as reconstructing the ingredients.
14

And that's it!

  • An autoencoder is simply a neural network that
    • Has the same input and output.
    • Gets "skinny" in the middle.
  • The advantages and disadvantages are the usual ones for neural networks. A minimal sketch of such a network follows.
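
A minimal sketch in Python with tensorflow/keras; the layer sizes (20 inputs, a 2-node middle layer) are illustrative assumptions, not from the slides.

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    p = 20  # number of input (and output) variables (illustrative)
    d = 2   # size of the "skinny" middle layer

    inputs = tf.keras.Input(shape=(p,))
    # encoder: everything up to the skinny middle layer
    h = layers.Dense(10, activation="relu")(inputs)
    encoded = layers.Dense(d, activation="relu")(h)
    # decoder: everything past the skinny middle layer
    h = layers.Dense(10, activation="relu")(encoded)
    decoded = layers.Dense(p, activation="linear")(h)

    autoencoder = Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    # same input and output: x_train is both input and target
    # autoencoder.fit(x_train, x_train, epochs=50)

    # the low-dimensional representation comes from the encoder half
    encoder = Model(inputs, encoded)
    # scores = encoder.predict(x_train)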
15

Advantages

  • Very flexible and non-linear
  • Can borrow ideas from the neural network literature
    • Using convolutional layers for images (sketched below)
    • Using recurrent layers for time series
  • Plenty of software and efficient packages
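
For instance, the encoder half might swap dense layers for convolutions when the inputs are images. A hedged sketch, assuming 28x28 greyscale inputs:

    import tensorflow as tf
    from tensorflow.keras import layers

    # convolutional encoder for 28x28 greyscale images (illustrative)
    conv_encoder = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
        layers.Conv2D(8, 3, strides=2, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dense(2),  # the "skinny" middle layer
    ])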
16

Software

  • Can be implemented using the dimRed package.
  • However, this merely wraps around tensorflow.
    • Need to have reticulate and Python properly configured.
    • Can be a bit time consuming and tricky.
    • For autoencoders probably better to use Python.
17

Disadvantages

  • Lots of tuning
    • How many layers?
    • Which activation function?
  • Lots of parameters
    • Sparsity
  • Lots of training data often needed
18

Questions?

19
