Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I need to read the contents of several thousand small files at startup. On Linux, just using fopen and reading is very fast. On Windows, this happens very slowly.

I have switched to overlapped (asynchronous) I/O using ReadFileEx, where Windows invokes a callback when the read completes.

However, the thousands of calls to CreateFile themselves are still a bottleneck. Note that I supply my own buffers, set FILE_FLAG_NO_BUFFERING, give the FILE_FLAG_SEQUENTIAL_SCAN hint, etc. Even so, the calls to CreateFile take tens of seconds, whereas on Linux everything finishes much faster.

Is there anything that can be done to get these files ready for reading more quickly?

The call to CreateFile is:

hFile = CreateFile(szFullFileName,
                   GENERIC_READ,
                   FILE_SHARE_READ | FILE_SHARE_WRITE,
                   NULL,
                   OPEN_EXISTING,
                   FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED |
                       FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN,
                   NULL);
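
For context on the FILE_FLAG_NO_BUFFERING part of the call above: with that flag, read sizes and file offsets have to be multiples of the volume's sector size, and the buffer address has to be suitably aligned. A minimal sketch of the size rounding follows. The helper name round_up_to_sector is hypothetical (it is not a Windows API), and the sector size is assumed to be obtained elsewhere, for example from GetDiskFreeSpace.

```c
#include <stddef.h>

/* Hypothetical helper, not part of any Windows API: rounds len up to the
   next multiple of sector_size so the value can be used as a read length
   on a handle opened with FILE_FLAG_NO_BUFFERING.
   sector_size must be non-zero. */
size_t round_up_to_sector(size_t len, size_t sector_size)
{
    size_t rem = len % sector_size;
    return rem == 0 ? len : len + (sector_size - rem);
}
```

For a 700-byte file on a volume with 512-byte sectors, this gives a read length of 1024 bytes; the number of bytes actually read still reflects the real file size.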

1 Answer

CreateFile in kernel32.dll adds some overhead on top of NtCreateFile in ntdll.dll, the native system call that CreateFile ultimately uses to ask the kernel to open the file. If you need to open a large number of existing files, calling the closely related NtOpenFile directly will be more efficient, because it avoids the special cases and path translation that Win32 performs, none of which apply to a bunch of files in one directory anyway.

NTSYSAPI NTSTATUS NTAPI NtOpenFile(OUT HANDLE *FileHandle,
                                   IN ACCESS_MASK DesiredAccess,
                                   IN OBJECT_ATTRIBUTES *ObjectAttributes,
                                   OUT IO_STATUS_BLOCK *IoStatusBlock,
                                   IN ULONG ShareAccess,
                                   IN ULONG OpenOptions);

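HANDLE Handle;
NTSTATUS Status;
OBJECT_ATTRIBUTES Oa = {0};
UNICODE_STRING Name_U;
IO_STATUS_BLOCK IoSb;

/* Name is the file name, relative to ParentDirectoryHandle. */
RtlInitUnicodeString(&Name_U, Name);

Oa.Length = sizeof Oa;
Oa.ObjectName = &Name_U;
Oa.Attributes = CaseInsensitive ? OBJ_CASE_INSENSITIVE : 0;
Oa.RootDirectory = ParentDirectoryHandle;

/* No FILE_SYNCHRONOUS_IO_* option is passed, so the handle is opened for asynchronous I/O. */
Status = NtOpenFile(&Handle, FILE_READ_DATA, &Oa, &IoSb,
                    FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                    FILE_SEQUENTIAL_ONLY);
if (!NT_SUCCESS(Status)) { /* handle the error */ }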

Main downside: this API is not supported by Microsoft for use in user mode. That said, the equivalent function is documented for kernel mode use and hasn't changed since the first release of Windows NT in 1993.

NtOpenFile also allows you to open a file relative to an existing directory handle (ParentDirectoryHandle in the example), which should cut down on some of the filesystem overhead in locating the directory.
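
One practical detail when bypassing Win32: the native API expects NT-namespace paths. When no RootDirectory is supplied (for instance, when opening the parent directory itself), the object name has to be a fully qualified NT path such as \??\C:\data rather than C:\data. The sketch below shows that prefixing for the simple drive-letter case only. The helper name to_nt_path is hypothetical, and it deliberately does not handle relative paths, UNC paths, or "." and ".." components.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not a Windows API): builds the NT-namespace form of
   a fully qualified drive-letter path, e.g.
       C:\data\a.bin  ->  \??\C:\data\a.bin
   Returns 0 on success, or -1 if the input is not of the form X:\...
   or the output buffer (out_chars wide characters) is too small. */
int to_nt_path(const wchar_t *win32_path, wchar_t *out, size_t out_chars)
{
    static const wchar_t prefix[] = L"\\??\\";
    const size_t prefix_len = sizeof prefix / sizeof prefix[0] - 1;
    size_t path_len;

    if (win32_path == NULL || out == NULL)
        return -1;

    path_len = wcslen(win32_path);

    /* Accept only "X:\..." style absolute paths. */
    if (path_len < 3 || !iswalpha((wint_t)win32_path[0]) ||
        win32_path[1] != L':' || win32_path[2] != L'\\')
        return -1;

    /* Need room for the prefix, the path, and the terminating null. */
    if (prefix_len + path_len + 1 > out_chars)
        return -1;

    wmemcpy(out, prefix, prefix_len);
    wmemcpy(out + prefix_len, win32_path, path_len + 1); /* copies the null too */
    return 0;
}
```

The resulting string could be passed to RtlInitUnicodeString to open the directory once; the individual files can then be opened by bare name relative to that handle, as in the example above.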

In the end, NTFS may just be too slow at handling directories with large numbers of files, as Carey Gregory said.

