Get-ChildItem vs Dir in PowerShell
Batch as usual
Recently I needed to modify a large number of
files across a network share. After installing the Serviio media streaming service
I noticed that it would crash "randomly". After checking the logs it
was clear that ffmpeg was crashing when trying to open subtitle files that were
not in the encoding indicated by the Serviio console. I needed a quick way to
update all subtitle files to a common encoding, so I decided to convert all
files to UTF-8.
Since I use a Windows environment I went directly to
PowerShell and wrote this:
Get-ChildItem -Path .\ -Filter *.srt -File | ForEach-Object { (Get-Content $_.FullName) | Out-File $_.FullName -Encoding utf8 }
This worked almost right, except for files with "["
or "]" (among other symbols) in the name or path. To solve it, I
added the "-LiteralPath" parameter, which tells PowerShell not to interpret
any wildcards in the path and to use it exactly as it is.
Get-ChildItem -Path .\ -Filter *.srt -File | ForEach-Object { (Get-Content -LiteralPath $_.FullName) | Out-File -LiteralPath $_.FullName -Encoding utf8 }
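The wildcard behaviour can be reproduced in isolation. The following is a minimal sketch, assuming a throwaway folder under the temp directory; the file name "demo [1].srt" and the folder name are made up for illustration.

```powershell
# Create a scratch folder and a file whose name contains wildcard characters.
# (Folder and file names are hypothetical, chosen only for this demo.)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("srt-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
$file = Join-Path $dir 'demo [1].srt'
[System.IO.File]::WriteAllText($file, "1`r`n00:00:01,000 --> 00:00:02,000`r`nHello`r`n")

# -Path treats "[1]" as a wildcard character class, so the pattern
# describes a file called "demo 1.srt" rather than "demo [1].srt".
$viaPath = Test-Path -Path $file

# -LiteralPath uses the string exactly as given.
$viaLiteral = Test-Path -LiteralPath $file

"Path: $viaPath  LiteralPath: $viaLiteral"

# Clean up the scratch folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

With -Path the bracketed file is not found, while -LiteralPath finds it, which matches the failure seen with the first version of the script.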
And done! All subtitle files on my media server are now in
UTF-8 and Serviio works without crashing.
Too slow?
Performance, however, was a concern. I noticed this was a bit
slower than it should be, considering a fast network, the small size of the subtitle
files (under 100 KB on average) and how simple the process is. This single-line script has
only 2 parts:
(get the files) then
for each file (convert it)
I started to dig a little into Get-ChildItem and found
that there have been complaints about its performance for some time, although it is
much better now than in previous versions. Anyway, I tried a different way to do
that same first part of getting the files and compared it against Get-ChildItem.
Using "cmd /c", I ran "dir /s /b <pattern>" and did some tests, both locally and over the network. See the image below for an example measuring the search for .exe files on another drive.
Both over the network and locally, the "dir"
version was faster. Of course, it gathers less information than Get-ChildItem,
which creates an object for each file it returns.
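A comparison like this can be timed with Measure-Command. This is only a sketch: $root and $pattern are placeholders (here the PowerShell install folder and *.dll), and "/a-d" is added to "dir" so that, like -File, it skips directories. It assumes Windows, since it calls cmd.

```powershell
# Placeholder search root and pattern; point these at the folder or share to test.
$root    = $PSHOME
$pattern = '*.dll'

Push-Location -LiteralPath $root
try {
    # Time the cmdlet version (builds a FileInfo object per file).
    $gciTime = Measure-Command {
        Get-ChildItem -Path .\ -Filter $pattern -File -Recurse -ErrorAction SilentlyContinue
    }

    # Time the cmd version (emits plain path strings only).
    $dirTime = Measure-Command {
        cmd /c dir /s /b /a-d $pattern 2>$null
    }
}
finally {
    Pop-Location
}

'Get-ChildItem: {0:N0} ms' -f $gciTime.TotalMilliseconds
'cmd /c dir   : {0:N0} ms' -f $dirTime.TotalMilliseconds
```

The first pass over a folder tends to warm the file system cache, so it is worth running the whole thing a few times and alternating the order before trusting the numbers.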
For a final test I then changed the original script to:
cmd /c dir /s /b *.srt | foreach { (Get-Content -LiteralPath $_) | Out-File -LiteralPath $_ -Encoding UTF8 }
It works just like the original script and is a bit faster. Since most of the execution time goes to converting the files' encoding, the speed-up is not that big in this particular case. However, when I need to search and filter files in the thousands, I no longer use Get-ChildItem.
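When only a few of those thousands of files need the richer information that Get-ChildItem provides, one option is to filter the plain paths first and build objects afterwards. This is a rough sketch rather than part of the original script; the folder, the *.dll pattern and the "*System*" filter are placeholders.

```powershell
# Placeholder folder; "dir /s /b" emits one full path per line as a plain string.
$root = $PSHOME
Push-Location -LiteralPath $root
try {
    $paths = @(cmd /c dir /s /b /a-d *.dll 2>$null)
}
finally {
    Pop-Location
}

# Cheap string-level filtering first (placeholder condition) ...
$selected = @($paths | Where-Object { $_ -like '*System*' } | Select-Object -First 5)

# ... then create FileInfo objects only for the paths that survived.
$items = @($selected | ForEach-Object { Get-Item -LiteralPath $_ })

$items | Select-Object Name, Length, LastWriteTime
```

This keeps the fast listing for the bulk of the work and pays the object-creation cost only for the handful of files that are actually inspected.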
Hope this is useful for you all.