- D Cookbook
- Adam D. Ruppe
- 2021-07-16 11:50:45
Finding the largest files in a directory
Suppose you're out of disk space. A solution may be to delete old, large files from a directory. Let's write a D program to perform this task.
How to do it…
Execute the following steps to find the largest files in a directory:
- Use the std.file.dirEntries function to get a listing of all files.
- Declare an array of DirEntry and append each entry to it.
- Sort the array by size in descending order by using std.algorithm and a lambda function.
- Filter out the newer files with std.algorithm.filter.
- Delete the top 10 files with the std.file.remove function.
The code is as follows:
    void main() {
        import std.file, std.algorithm, std.datetime, std.range;

        // Gather every entry under the directory into an array.
        DirEntry[] allFiles;
        foreach(DirEntry entry; dirEntries("target_directory", SpanMode.depth))
            allFiles ~= entry;

        // Largest first.
        auto sorted = sort!((a, b) => a.size > b.size)(allFiles);
        // Keep only the files last modified more than 14 days ago.
        auto filtered = filter!((a) => Clock.currTime() - a.timeLastModified > 14.days)(sorted);
        // Delete the 10 largest of those.
        foreach(file; filtered.take(10))
            remove(file.name);
    }
How it works…
Phobos provides the std.file module for high-level operations on files and directories. With it, we can read and write files, list files in a directory, get file information, and perform common operations such as deleting and copying files.
The dirEntries function returns an object that works with foreach. Depending on the type you request in the loop, it will provide different information. Writing foreach(string name; dirEntries()) gives you just the filenames, while writing foreach(DirEntry entry; dirEntries()) gives you the details of each entry.
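The two loop forms can be sketched as follows. This is a minimal illustration, not part of the recipe: entryNames and totalFileSize are hypothetical helper names, and "." (the current directory) is only a placeholder path.

```d
import std.file : dirEntries, DirEntry, SpanMode;
import std.stdio : writeln;

// Hypothetical helper: collect just the names of a directory's entries.
string[] entryNames(string dir)
{
    string[] names;
    // Requesting a string yields only the name of each entry.
    foreach (string name; dirEntries(dir, SpanMode.shallow))
        names ~= name;
    return names;
}

// Hypothetical helper: add up the sizes of the regular files in a directory.
ulong totalFileSize(string dir)
{
    ulong total;
    // Requesting a DirEntry exposes details such as isFile and size.
    foreach (DirEntry entry; dirEntries(dir, SpanMode.shallow))
        if (entry.isFile)
            total += entry.size;
    return total;
}

void main()
{
    writeln(entryNames("."));
    writeln(totalFileSize("."), " bytes");
}
```

SpanMode.shallow lists only the directory itself; the recipe uses SpanMode.depth to descend into subdirectories as well.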
To see how this works, it helps to know what foreach can iterate over. D's foreach loop understands four kinds of items: a numeric interval, arrays (or slices), input ranges, and objects with a member function called opApply. These are explained in detail in the following paragraphs.
Numeric intervals are a simple start-to-finish progression of integers, as shown in the following line of code:
    foreach(num; 0 .. 10) { /* loops from num = 0 up to, but not including, num = 10 */ }
Input ranges are iterable objects that are used throughout much of Phobos. Indeed, the sort, filter, and take functions we use here (sort and filter from std.algorithm, take from std.range) both consume and return ranges. Ranges will be covered in greater depth later in this book.
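As a rough sketch of what an input range looks like, the hypothetical Countdown type below provides the three members foreach relies on: empty, front, and popFront. The type and its name are invented for illustration.

```d
import std.range.primitives : isInputRange;
import std.stdio : writeln;

// Hypothetical example type: counts down from a starting value to 1.
struct Countdown
{
    int current;

    // The three primitives every input range provides.
    @property bool empty() const { return current <= 0; }
    @property int front() const { return current; }
    void popFront() { --current; }
}

// Checked at compile time: Countdown satisfies the input range interface.
static assert(isInputRange!Countdown);

void main()
{
    // foreach calls empty, front, and popFront behind the scenes.
    foreach (n; Countdown(3))
        writeln(n); // prints 3, then 2, then 1
}
```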
While input ranges are useful for a variety of tasks, they aren't ideal for everything. The opApply function is used for these cases. It is a special member function on a struct or a class that takes a delegate. The arguments to the delegate are the foreach iteration variable types, and the body of the delegate is automatically set to be the inner code of the loop. The delegate's return value carries flow control, similar to blocks in Ruby: a non-zero value means the loop body executed a break or return, so opApply must stop iterating and pass that value back.
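A minimal sketch of opApply, assuming a hypothetical Evens struct invented for this illustration, might look like this:

```d
import std.stdio : writeln;

// Hypothetical container used only to illustrate opApply.
struct Evens
{
    int limit;

    // foreach passes the loop body in as the delegate dg.
    int opApply(scope int delegate(int) dg)
    {
        for (int i = 0; i < limit; i += 2)
        {
            // A non-zero result means the body hit break or return,
            // so iteration stops and the value is passed back up.
            if (auto result = dg(i))
                return result;
        }
        return 0;
    }
}

void main()
{
    foreach (n; Evens(10))
    {
        if (n > 6)
            break;       // makes dg return a non-zero value
        writeln(n);      // prints 0, 2, 4, 6
    }
}
```

The loop variable n takes its type from the delegate's parameter, which is why the foreach line needs no explicit type.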
After gathering the data, we use std.algorithm and std.range to sort, filter, and limit the size of the results. These functions show the power of input ranges and lambda functions. The syntax (a) => a is a lambda function. First, there is a parameter list in parentheses. Types are optional here; if excluded, the lambda function is implemented as a template with implicit types from context. Then, the symbol => is the key indicator of a lambda function, and finally you have the return value. The short lambda syntax is only usable for a single expression and cannot return void.
The (a) => a function, in this context, could alternatively be written as a => a. If it has only one argument, the parentheses are optional. It could also be written as delegate int(int a) { return a; }, function(int a) { return a; }, or even (int a) { return a; }. The delegate and function options make two separate but related types. The difference between a delegate and a function is that a delegate has a context pointer whereas a function does not. The context pointer gives the delegate access to variables from the surrounding scope. A function can only access global variables, data through its arguments, and static data. If you do not specify one of the two keywords, the compiler infers which one you require: a delegate if the body uses the surrounding scope, a function otherwise.
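The different spellings can be compared side by side. The following is an illustrative sketch; makeAdder and the variable names are hypothetical.

```d
import std.stdio : writeln;

// Hypothetical helper: the returned delegate keeps access to offset
// through its context pointer, even after makeAdder has returned.
int delegate(int) makeAdder(int offset)
{
    return (int a) => a + offset;
}

void main()
{
    // Fully spelled-out literals with an explicit kind and return type.
    int delegate(int) viaDelegate = delegate int(int a) { return a; };
    int function(int) viaFunction = function int(int a) { return a; };

    // Short lambda; the parameter type is inferred from the target type.
    int function(int) viaLambda = a => a;

    auto addTen = makeAdder(10);

    writeln(viaDelegate(1), " ", viaFunction(2), " ", viaLambda(3), " ", addTen(4));
    // prints: 1 2 3 14
}
```

A literal that reads a local variable such as offset cannot be stored in an int function(int) variable, since a plain function has no context pointer to reach it.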
With that background information, let's look at the following line of code in more detail:
sort!((a, b) => a.size > b.size)(allFiles);
The std.algorithm.sort function takes two arguments: an optional comparison function, given as a compile-time argument for maximum efficiency, and a random access range to sort. A random access range is any iterable object in which you can jump directly to any index. The most common random access range is an array or a slice. This is why we built an array of DirEntry: firstly, because dirEntries hands out its entries one at a time during iteration rather than offering random access, and secondly, because sorting needs the whole list ready at once.
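To try the comparison function without touching real files, a sketch like the following can help. FileInfo is a hypothetical stand-in for DirEntry, and largestFirst is an invented helper name.

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Hypothetical stand-in for DirEntry so the example needs no real files.
struct FileInfo
{
    string name;
    ulong size;
}

// Sorts the array in place, largest first, and returns it.
FileInfo[] largestFirst(FileInfo[] files)
{
    // The comparison is passed as a compile-time argument after the !.
    sort!((a, b) => a.size > b.size)(files);
    return files;
}

void main()
{
    auto files = [FileInfo("a.log", 120), FileInfo("b.iso", 9000), FileInfo("c.txt", 15)];
    foreach (f; largestFirst(files))
        writeln(f.name, " ", f.size); // b.iso, then a.log, then c.txt
}
```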
The next line is very similar. Again, we use a function from std.algorithm that takes a range and a function (called a predicate in the std.algorithm documentation). The filter returns a new range that keeps the order of the sorted list but contains only the items for which the predicate is true. Here, the predicate is true for a file last modified more than 14 days before the current time, so newer files are dropped and only the older ones remain.
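The keep-or-drop behaviour of the predicate can be checked on plain data. In this sketch, Aged and olderThan are hypothetical names, and each file's age is supplied directly instead of being computed from Clock.currTime().

```d
import core.time : Duration, days;
import std.algorithm : filter;
import std.array : array;
import std.stdio : writeln;

// Hypothetical record: a file name plus how long ago it was modified.
struct Aged
{
    string name;
    Duration age;
}

// Keeps only the entries older than the cutoff. filter is lazy, so
// array() is used here to materialize the result.
Aged[] olderThan(Aged[] files, Duration cutoff)
{
    return files.filter!(f => f.age > cutoff).array;
}

void main()
{
    auto files = [Aged("old.log", 30.days), Aged("new.log", 2.days)];
    foreach (f; olderThan(files, 14.days))
        writeln(f.name); // prints old.log
}
```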
Let's also look at the syntax of 14.days. days is a function in the module core.time that takes an integer count and returns a Duration (it is declared as an alias of dur!"days"). Calling it as 14.days uses a D feature called Uniform Function Call Syntax (UFCS). With UFCS, a call to foo.bar may be rewritten as bar(foo). This lets us extend any type, including built-in types such as integers and arrays, in our code, adding new pseudomembers and properties. When used properly, this gives extensibility and readability, and can even help encapsulation, allowing you to write extension methods outside the original module, thus limiting access to private data.
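A small sketch of UFCS on a built-in type follows; doubled is a hypothetical free function written only for this illustration.

```d
import core.time : days, Duration;
import std.stdio : writeln;

// Hypothetical free function; nothing here modifies the int type itself.
int doubled(int x)
{
    return x * 2;
}

void main()
{
    // Both spellings call the same free function.
    writeln(doubled(21)); // 42
    writeln(21.doubled);  // 42

    // 14.days is the same kind of call on a core.time function.
    Duration viaUfcs = 14.days;
    Duration viaCall = days(14);
    assert(viaUfcs == viaCall);
}
```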
Finally, we complete our task by using take(10) (via a UFCS call), which takes the first 10 items off the filtered list, and calling remove from std.file to delete each of those files.
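One detail worth seeing in isolation is that take yields at most the requested number of items, so the recipe still behaves sensibly when fewer than 10 files pass the filter. A minimal sketch with made-up sizes:

```d
import std.range : take;
import std.stdio : writeln;

void main()
{
    // Made-up sizes, already sorted largest first.
    auto sizes = [900, 700, 500, 300, 100];

    // take(n) yields at most n items from the front...
    writeln(sizes.take(3));   // [900, 700, 500]

    // ...so asking for more than exist is safe.
    writeln(sizes.take(10));  // [900, 700, 500, 300, 100]
}
```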