This part will present the basics of Flux on the Iris dataset from the previous lecture. We moved the auxiliary functions from the previous lesson into the `utilities.jl` file, which we load by

```julia
include("utilities.jl")
```

We set the seed and load the data in the same way as during the last lecture.

```julia
X_train, y_train, X_test, y_test, classes = prepare_data(X', y; dims=2)
```

Creating the network

We recall that machine learning minimizes the discrepancy $\operatorname{loss}(\hat y, y)$ between the predictions $\hat y$ and the labels $y$. Since we have the model and the loss function (sketched at the end of this post), the only remaining thing is the gradient. Flux again provides a smart way to compute it.

```julia
grad = gradient(() -> L(X_train, y_train), ps)
```

```
┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(4 => 5, relu)  # 25 parameters
└ @ Flux ~/.julia/packages/Flux/u7QSl/src/layers/stateless.jl:60
```

The first argument of `gradient` is the function we want to differentiate, and the second one is the parameters. The loss function `L` needs to be evaluated at the correct points `X_train` and `y_train`. In some applications, we may need to differentiate with respect to other parameters such as `X_train`. This can be achieved by changing the second argument of the `gradient` function.

```julia
grad = gradient(() -> L(X_train, y_train), params(X_train))
```

Since `X_train` has shape $4\times 120$, the gradient needs to have the same size. We train the classifier for 250 iterations.
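The text above refers to the model `m` and the loss function `L` without showing their definitions. The following is a minimal sketch of how they might look, assuming a single hidden layer of 5 neurons (matching the `Dense(4 => 5, relu)` layer reported in the warning), three output classes for the Iris species, and a cross-entropy loss; the exact architecture used in the original lecture may differ.

```julia
using Flux
using Flux: crossentropy

# Assumed architecture: 4 input features, 5 hidden neurons, 3 Iris classes.
m = Chain(
    Dense(4 => 5, relu),
    Dense(5 => 3),
    softmax,
)

# Cross-entropy between the network output m(x) and the one-hot labels y.
L(x, y) = crossentropy(m(x), y)

# Implicit-parameter style used in the text above: collect all trainable parameters.
ps = Flux.params(m)
```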
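The training loop itself is also not shown in this excerpt. Below is a minimal sketch of the 250 iterations mentioned above, assuming plain gradient descent with a step size of 0.1 (the optimiser and learning rate are assumptions, not taken from the original text).

```julia
opt = Descent(0.1)  # assumed optimiser and learning rate

for i in 1:250
    # Gradient of the loss with respect to the model parameters.
    grad = gradient(() -> L(X_train, y_train), ps)
    # Gradient-descent step on all parameters in ps.
    Flux.Optimise.update!(opt, ps, grad)
end
```

Converting the inputs to `Float32` beforehand, e.g. `X_train = Float32.(X_train)`, avoids the precision warning shown earlier.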