In computer science, a tree is a hierarchical data structure used to organize information in a way that makes it easy to find related items. We will create a tree with nodes to store some data.
Before we go on, here's some tips for users of other programming languages this year:
A node represents an element or value within the tree, and each node can have one parent node (its direct ancestor) and zero or more child nodes (its descendants).
The top-most node in the tree is called the root node, and the nodes at the bottom of the tree are called leaf nodes. Trees can be used to represent hierarchical relationships between items, such as a family tree, organizational chart, or file system directory structure.
The file system on your computer is a tree. It starts with the root directory (on Windows, C:\
; on macOS or Linux, /
). Let's recreate the general structure of one.
First, before we even create the tree, let's create the constituent nodes. The nodes of a file system are files and folders. Since there are two types of nodes, let's create struct
s for them.
struct File {
/// The name of the file.
var name: String
}
struct Folder {
/// The name of the folder.
var name: String
/// The files contained in the folder.
var files: [File] = []
}
You will notice that I have created an empty array already for files
. This is so that when I create a Folder
object, I don't have to specify a list of files — although, I could if I wanted to!
Next, let's create a folder and a file. Do not add the file to the folder yet — create the Folder
first.
var folder = Folder(name: "My Folder")
var file = File(name: "README.txt") // Notice no files array needed here.
Next, let's add the file to the folder.
folder.files.append(file)
Simple enough!
Let's give the files and folders the ability to represent their own path. You may be familiar with paths like C:\Users\yourname\Downloads\somefile.exe
from Windows or /home/yourname/Downloads/somefile.flatpak
from Linux.
We will create Linux-style paths where the folder path starts with /
. For example, the path to our file will be /My Folder/README.txt
.
To do this, we will make the struct
s conform to CustomStringConvertible
. Let's start with Folder
.
struct Folder : CustomStringConvertible {
// Existing members of the struct here.
// ...
var description: String {
return "/" + name
}
}
print(folder)
This should print /My Folder
, which is great.
Let's make File
also conform to CustomStringConvertible
:
struct File : CustomStringConvertible {
// Existing members of the struct here.
// ...
var description: String {
return "/" + name
}
}
print(file)
This will return /README.txt
… but that's not sufficient. We want to know the name of the folder that contains the file! The full path should be /My folder/README.txt
.
To make this possible, the child/leaf (file) needs to be aware of its parent (folder) — it can't print its parent folder unless it knows what it is.
Let's add the concept of a parent folder to the File
struct. This can be used to show the name of the parent.
struct File : CustomStringConvertible {
var name: String
var parentFolder: Folder?
var description: String {
guard let parentFolder else {
return name
}
return "\(parentFolder)/\(name)"
}
}
Notice that description
now uses the parent folder's description in itself. The parent nodes will also use their parent folder's description to get their own name, and so on and so forth until you reach the root node which, lacking a parent, simply gives its name and nothing else.
When you combine all of these components together, you create a single, long path starting with the root node and ending with whichever object you printed.
This is an example of recursion, where a method (or in our case, a computed variable; same thing) calls itself until a defined scenario where it will stop doing so.
In this case, description
gets used by each node, starting with the current one and going up through its parents until it reaches the root node. As the root node has no parent, the recursion ends.
Recursion in itself, even not using object-oriented programming, can be counted as a complex programming technique. However, it will not be taught in this unit.
Learn more about recursion in Swift.
As you can see, parentFolder
is optional. That's because the file might not be contained in a folder — it might be the root node! This is why we need to use guard let
to ensure the parent folder exists.
Of course, a file system that can only contain a single file might seem ludicrous to you. However, early file systems in the '70s and '80s had stranger limitations than that!
Let's add a folder to contain the one we already have. This will be our top-level folder, right at the front of the path — the new root node.
Before we do that, we have a problem that needs solving: currently, a Folder
can only contain File
s.
We need to be able to have a general type to treat File
and Folder
the same. Using polymorphism, let's create a protocol that can describe both types called FilesystemNode
.
A file system node should have a name, just like any file or folder. It should also have the concept of a parent node, so that a file can be contained in a folder — but so can another folder!
Since both files and folders will be able to contain parent nodes, let's also move the implementation of CustomStringConvertible
from File
to an extension.
protocol FilesystemNode : CustomStringConvertible {
var name: String { get set }
var parentNode: FilesystemNode? { get set }
}
extension FilesystemNode {
var description: String {
guard let parentNode = parentNode else {
return name
}
return "\(parentFolder)/\(name)"
}
}
Let's adjust the File
and Folder
struct
s accordingly:
struct File : FilesystemNode {
var name: String
var parentNode: FilesystemNode? // Changed from Folder?
}
struct Folder : FilesystemNode {
var name: String
var parentNode: FilesystemNode? // Just like File.
var children: [FilesystemNode] = [] // Changed from var files: [File]
}
With those adjusted, let's create the new root folder. Let's call it "C Drive".:
var root = Folder(name: "C Drive")
var folder = Folder(name: "My Folder")
var file = File(name: "README.txt")
// Add My Folder to the C Folder.
root.children.append(folder)
// Add README.txt to My Folder, inside the C Drive.
folder.children.append(file)
Finally, let's try printing the file
object to see its description:
print(file)
README.txt
Unfortunately, the file
object isn't aware of its parent so it doesn't print anything other than "README.txt
" when printed.
That is to say, its parent nodes are aware of their children but the relationship doesn't happen in reverse. Without establishing that relationship, any given node cannot tell you its full path (including the parent and root nodes).
To facilitate this, we will create a method in the Folder
class to add child nodes (i.e. files and folders). This will make use of a new feature: inout
variables.
struct Folder : FilesystemNode {
// Variables...
// ...
mutating func add(_ child: inout some FilesystemNode) {
child.parentNode = self
children.append(child)
}
}
This new method is mutating
because it will modify the data in the object (i.e. it adds to the children
array); we've seen that before so it should be fairly straight-forward.
What's new is the inout
keyword.
inout
Normally, when you pass an argument to a Swift function, that value is treated as a constant within the method body: it cannot be modified, only used.
However, using inout
tells Swift to make a new, mutable copy of that passed-in value. This allows us to modify the data before using it.
In this case, we are modifying the child.parentNode
value and setting it to self
. This means that the child's parent will become the FilesystemNode
-conforming object whose add(_:)
method was called; in this case, the Folder
.
When you call a method with an inout
parameter, its value must be passed with an &
ampersand prepended (put at the start) of its name. Therefore, adding nodes to folders now looks like this:
// Add My Folder to the C Folder.
root.add(&folder) // Changed from root.children.append
// Add README.txt to My Folder, inside the C Drive.
folder.add(&file) // Changed from folder.children.append
some
What about the some
keyword after inout
?
That is more Swift safety at play. If we leave the some
out, Swift infers that the child
parameter must be any FilesystemNode
; that is any object that conforms to the FilesystemNode
protocol.
However, it means that at any given time the method is run, Swift has to do some runtime checks (while the program is running) to ensure the correct conformance.
As part of being able to do this, Swift will require you to declare the root
, folder
, and file
variables generically as any FilesystemNode
as well. This would look like:
// Note the inserted `any FilesystemNode`
var root: any FilesystemNode = Folder(name: "C Drive")
var folder: any FilesystemNode = Folder(name: "My Folder")
var file: any FilesystemNode = File(name: "README.txt")
The problem here is that since Swift will now only treat these objects generically, there isn't a convenient way to call the Folder
objects' add(_:)
method: as far as Swift is concerned, they're not Folder
s, they're any FilesystemNode
.
The some
keyword in add(_:)
's method signature helps Swift to know that, at any given time that add(_:)
is called, it will only be dealing with one kind of object that conforms to FilesystemNode
.
Therefore, Swift can 'imagine' that this call:
folder.add(file)
is calling a method with this signature:
// Note: File instead of FilesystemNode
mutating func add(_ child: inout File)
instead of:
mutating func add(_ child: inout any FilesystemNode)
Swift, satisfied with this, will not need to do any runtime checks. Thus:
any FilesystemNode
to be able to pass them to the methodFolder
object's add(_:)
method without too much hassleMake sure that you have completed the file system activities above. You should have:
FilesystemNode
protocol, describing what a filesystem node contains (i.e. a name, an optional parent)FilesystemNode
so that it conforms to CustomStringConvertible
Folder
struct conforming to FilesystemNode
, which additionally contains child nodes (children
) and can add items to itself (add(_:)
)File
struct conforming to FilesystemNode
Folder
and File
objectsThe current system works well when working from the nodes down to the root of the tree.
What happens if we want to search through the tree to find a node?
In our current system, if we were to implement a search function for our FilesystemNode
func search(byName name: String) -> FilesystemNode?
In our Folder struct it could look something like this.
func search(byName name: String) -> FilesystemNode? {
if self.name == name {
return self
}
for child in children {
if let foundFileNode = child.search(byName: name) {
return foundFileNode
}
}
return nil
}
And a search function in our File struct
func search(byName name: String) -> FilesystemNode? {
if self.name == name {
return self
} else {
return nil
}
}
If you were to try and run this code on the root folder, it won't find the file
if let foundFileNode = root.search(byName: "README.txt") {
print(foundFileNode)
} else {
print("Couldn't find file")
}
The code would output "Couldn't find file".
This is because when we added our folder
to the root
folder, we did this before we added our file
to our folder
, so the copy of folder
that exists in the children of root
is one without any children.
We could rewrite this so that we add file to folder before adding folder to root.
folder.add(&file)
root.add(&folder)
But if we try print out file we find that now our folder doesn't know that the parent is root. My Folder/README.txt
This is because our structs are treated as value types
where a new copy of the object is made when it is passed into another variable.
So what is the solution?
Classes are reference types
this means when they are passed into another variable, for example a list, they are still the same object in memory.
Class syntax is as follows
class Folder: FileSystemNode {
var name: String
var parentNode: FileSystemNode?
var children: [FileSystemNode]
init(name: String) {
self.name = name
children = []
}
func add(_ child: inout some FileSystemNode) {
child.parentNode = self
children.append(child)
}
}
You may have spotted the new function in the class init()
. Classes are required to have an initialiser to create the object.
You may have noticed that nothing else in this code has changed.
What this means for our code is that we can enter each of the nodes in any order.
Now if we use our search function it should work as intented.
Recreate the following folder structures…
TBD