2. Tutorial: Find Your Way
We are now ready to use Exifuse for the first time.
Note
We will do this under the assumption that you do not know the Julia language yet. The idea is to go slowly enough that you simply pick up Julia along the way; let’s see if that works out.
The Exifuse project we downloaded before contains a pics folder; change your directory to it and start Exifuse by running exifuse (or, on Windows, exifuse.bat):
$ cd exifuse/pics
$ exifuse
...
pics/::julia>
This will start a Julia REPL. Its prompt will look different from the usual julia> and display the current directory name (we will often shorten it here to simply >). But we are still in Julia and can do math:
> 1+1
2
We can run any Julia function—for example:
> pwd()
"/foo/exifuse/pics"
This displays your current directory or, more precisely, calls a function that returns your current directory as a string, which the REPL then immediately displays. You can work with that value, for example by storing it in a variable and passing it to another function:
> s = pwd()
"/foo/exifuse/pics"
> v = splitpath(s)
4-element Vector{String}:
"/"
"foo"
"exifuse"
"pics"
The second function, splitpath, is also a Julia function. It splits our path into a list (in Julia parlance, a vector) of path components. As always, the REPL displays the value returned from the function.
If you have such a vector, you can also easily access its elements (Julia indices start at 1, unlike Python’s):
> s = v[1]
"/"
> s = v[end]
"pics"
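As a small recap of the steps above, here is a sketch that uses only functions from Julia's standard Base library. The path is a made-up example, the variable names are our own, and the results in the comments assume a Linux or macOS system (Windows uses a different path separator).

```julia
# Plain Julia (Base only); the path is a made-up example.
s = "/foo/exifuse/pics"
v = splitpath(s)            # ["/", "foo", "exifuse", "pics"]

first_part = v[1]           # "/"        (indices start at 1)
last_part  = v[end]         # "pics"     (`end` is the last index)
parent     = v[end-1]       # "exifuse"  (index arithmetic works, too)

# joinpath is the counterpart of splitpath: it glues the
# components back together into a single path string.
rebuilt = joinpath(v...)    # "/foo/exifuse/pics"
println(rebuilt == s)
```

The three dots in joinpath(v...) spread the vector's elements out as separate arguments.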
We have now essentially mastered the Julia REPL. Let’s start looking around our file system and...
2.1. Leave!
It’s always good to know how to get out of a new tool (as anyone who has ever been trapped in the vi editor can attest). To exit Exifuse’s Julia REPL, just press <Ctrl+D>:
> [press <Ctrl+D>]
$ ls ## <- 'dir' for Windows
pics_flat pics_mess pics_tree
As the ls (or dir) command shows, we are back in our plain command line, and our folder contains three subfolders. Let’s head back to Exifuse and try the same there.
2.2. Look Around
$ exifuse
pics/::julia> ls()
2-element Vector{AbstractEntry}:
DirEntry("/foo/exifuse/pics/pics_flat", drwxr-xr-x)
DirEntry("/foo/exifuse/pics/pics_tree", drwxr-xr-x)
[ no files ]
[ 2 dirs ( no syml ) ( #paths:2 ) ] :: [ no dev,sock,fifo.. ] :: [ no unknown/broken ]
This calls the Exifuse ls() function, which returns the current folder’s content as a vector. (The output looks a bit different from that of the usual command line ls, but we can still identify the directories we saw before.) Let’s see now what’s inside the pics_flat subfolder:
pics/::julia> ls("pics_flat")
7-element Vector{AbstractEntry}:
FileEntry("/foo/exifuse/pics/pics_flat/e.jpg", -rw-r--r--, 24867 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/e2.gif", -rw-r--r--, 45390 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/f.jpg", -rw-r--r--, 31788 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/i.jpg", -rw-r--r--, 10154 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/s.tiff", -rw-r--r--, 130534 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/u.png", -rw-r--r--, 83761 bytes)
FileEntry("/foo/exifuse/pics/pics_flat/x.jpg", -rw-r--r--, 28068 bytes)
[ 7 files ( none of which symlinked ) -- 346.252 Kb -- 354,562 bytes ( #paths:7 #dev:1 #inodes&dev:7 ) ]
{ :jpeg 4/"jpg" :gif 1/"gif" :tiff 1/"tiff" :png 1/"png" }
[ no dirs ] :: [ no dev,sock,fifo.. ] :: [ no unknown/broken ]
Unsurprisingly, we now see the contents of the pics_flat subfolder. In addition, a short summary at the bottom gives you the most important statistics (more details on those later).
The crucial thing to keep in mind is this: we do not merely see the file system entries, as on the command line; we can also store them in variables and operate on them. Witness:
> F = ls("pics_flat")
...
> length(F)
7
> filesize(F)
354562
> filesizehuman(F)
"346.252 Kb"
(There are also commands that would, for example, simply delete all those files; but let’s keep those for later, so that we do not nuke your photo library by accident.)
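To make the idea of "remember and operate" more tangible, here is a rough analogue written with plain Julia Base functions only. It is not how Exifuse implements ls, and the variable names are our own; it assumes Julia 1.6 or later.

```julia
# Base Julia only; we simply list whatever directory we are in.
dir = pwd()

# readdir with join=true returns full paths instead of bare names.
paths = readdir(dir; join=true)

# Keep regular files only, then look up each file's size in bytes.
files = filter(isfile, paths)
sizes = map(filesize, files)

# The results live in variables, so we can keep computing with them
# without touching the file system again.
println(length(files), " files, ", sum(sizes; init=0), " bytes in total")
```

Exifuse's own entries carry more information than these bare path strings, but the principle is the same: list once, then work on the vector.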
2.3. Look Deeper
Above, we had a look at the pics_flat example folder, which just contained a few photos. More realistically, you want to be able to operate on whole folder hierarchies with a large number of photos inside. There is another subfolder called pics_tree that has several levels of subfolders. Let’s check it out, first with the ls command and then with find:
> ls("pics_tree")
3-element Vector{AbstractEntry}:
DirEntry("/foo/exifuse/pics/pics_tree/bar", drwxr-xr-x)
DirEntry("/foo/exifuse/pics/pics_tree/baz", drwxr-xr-x)
DirEntry("/foo/exifuse/pics/pics_tree/foo", drwxr-xr-x)
[ no files ]
[ 3 dirs ( no syml ) ( #paths:3 ) ] :: [ no dev,sock,fifo.. ] :: [ no unknown/broken ]
> find("pics_tree")
12-element Vector{AbstractEntry}:
DirEntry("/foo/exifuse/pics/pics_tree", drwxr-xr-x)
DirEntry("/foo/exifuse/pics/pics_tree/bar", drwxr-xr-x)
FileEntry("/foo/exifuse/pics/pics_tree/bar/s.tiff", -rw-r--r--, 130534 bytes)
FileEntry("/foo/exifuse/pics/pics_tree/bar/u.png", -rw-r--r--, 83761 bytes)
FileEntry("/foo/exifuse/pics/pics_tree/bar/x.jpg", -rw-r--r--, 28068 bytes)
DirEntry("/foo/exifuse/pics/pics_tree/baz", drwxr-xr-x)
FileEntry("/foo/exifuse/pics/pics_tree/baz/f.jpg", -rw-r--r--, 31788 bytes)
FileEntry("/foo/exifuse/pics/pics_tree/baz/i.jpg", -rw-r--r--, 10154 bytes)
DirEntry("/foo/exifuse/pics/pics_tree/foo", drwxr-xr-x)
DirEntry("/foo/exifuse/pics/pics_tree/foo/foobar", drwxr-xr-x)
...
[ 7 files ( none of which symlinked ) -- 346.252 Kb -- 354,562 bytes ( #paths:7 #dev:1 #inodes&dev:7 ) ]
{ :jpeg 4/"jpg" :tiff 1/"tiff" :png 1/"png" :gif 1/"gif" }
[ 5 dirs ( no syml ) ( #paths:5 ) ] :: [ no dev,sock,fifo.. ] :: [ no unknown/broken ]
The find function recursively goes through a folder and returns all file system entries it finds, i.e., directories as well as files.
What if we wanted to determine the number of photos in this folder hierarchy? We cannot simply use:
> E = find("pics_tree")
...
> length(E)
...
Why is this not correct? Well, the vector returned by find also contains all encountered subfolders, i.e., the directory entries. To get only the actual files, we can use our first filter function, getfiles:
> E = find("pics_tree")
...
> F = getfiles(E)
> length(F)
7
getfiles filters the vector and returns only actual, regular file entries (discarding the directory ones).
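For comparison, the same "walk the tree once, keep only the files" idea can be sketched in plain Julia with Base's walkdir. The helper name allfiles and the throwaway demo tree are our own inventions for illustration; this is not Exifuse's implementation of find or getfiles.

```julia
# Base Julia only: collect the paths of all files below `root`.
# `allfiles` is a hypothetical helper name, not part of Exifuse.
function allfiles(root::AbstractString)
    found = String[]
    # walkdir yields (directory, subdirectory names, file names) per folder
    for (dirpath, dirnames, filenames) in walkdir(root)
        for name in filenames
            push!(found, joinpath(dirpath, name))
        end
    end
    return found
end

# Build a small throwaway tree so that the sketch runs anywhere.
demo = mktempdir()
mkpath(joinpath(demo, "bar", "deep"))
touch(joinpath(demo, "a.jpg"))
touch(joinpath(demo, "bar", "b.png"))
touch(joinpath(demo, "bar", "deep", "c.gif"))

F = allfiles(demo)
println(length(F))    # 3: directories are not counted

# Further filtering happens in memory, without re-reading the disk.
jpgs = filter(p -> endswith(p, ".jpg"), F)
println(length(jpgs)) # 1
```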
In short: we now have all files of a whole directory tree collected in a single vector, at our immediate disposal. For all we know, this tree could have contained 100,000 files. We will soon learn to filter our vector in arbitrary ways, without ever going through the slow initial file operations again. The key is that most of the information we need is in memory, stored in our own variables. We only lose that data and any intermediate results when we quit Exifuse. That’s why we lastly need to learn to...
2.4. Amble
Above, we once briefly quit Exifuse to get back to the familiar command line, ready to move around with cd. While this is one way of navigating the file system, we would like to avoid it, as we would lose all the nice file lists and variables we just built inside Exifuse. So how do we move around without exiting Exifuse? Here’s how:
We can use Julia’s cd function to change directories, as in:
> ls()
...
> cd("pics_flat")
...
> ls()
...
This is doable, but it feels a bit hard on your finger joints, with all those brackets and quotation marks.
A smoother way to surf the file system is to press the semicolon key ; within Exifuse’s Julia REPL:
> [press ';' to switch to Julia's bash mode]
...
$ cd pics_flat
...
$ ls
...
$ [press <Backspace> (labelled Delete on Mac keyboards) to return to Julia's native mode]
> ls()
...
What happens here? When you press ;, the REPL switches to a bash-like mode in which you can use standard bash commands: no brackets, no quotation marks. Once you return to Julia’s native mode, all your data and variables are still at your disposal.
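As a side note, plain Julia also offers a third variant that the steps above do not use: cd accepts a function (here written as a do block), runs it inside the given directory, and then returns to where you started. The sketch below uses Base functions only and a temporary folder of our own making.

```julia
# Base Julia only: visit a directory temporarily, then come back.
start = pwd()
demo  = mktempdir()          # throwaway folder for this sketch

names = cd(demo) do
    touch("x.jpg")           # create an empty file inside `demo`
    readdir()                # list `demo`; this value is returned by cd
end

println(names)               # ["x.jpg"]
println(pwd() == start)      # true: we are back where we began
```

This can be handy in scripts, where you want to be sure you end up in the original directory even if something inside the block fails.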
We will soon look at more ways to interact with large sets of files, but before we tackle such use cases, let’s zoom in on individual files first.