File Operations in Python

From NovaOrdis Knowledge Base
=Internal=
* [[Python Code Examples#Code_Examples|Python Code Examples]]
* [[Python_Module_os#Working_Directory|os]]
* [[Python_Module_shutil#Overview|shutil]]
=TODO=
<font color=darkkhaki>
* TO PROCESS [[PyOOP]] "File I/O" + "Placing it in context"
* TO PROCESS [[PyOOP]] "Filesystem paths"
</font>


=Check whether a File Exists=
Use either the <code>pathlib</code> methods <code>[[#exists.28.29.2C_is_file.28.29.2C_is_dir.28.29|exists(), is_file(), is_dir()]]</code> or <code>[[#exists.28path_to_file.29|os.path.exists()]]</code>.
=Reading/Writing from/to Files=
==The <tt>open()</tt> Built-in==
<code>open()</code> is a [[Python Language Functions#open|built-in function]].
==Read==
<font color=darkkhaki>
The <code>with</code> statement treats the file object returned by <code>open()</code> as a context manager: the file is closed automatically when the block exits, even if an exception is raised inside it.
<syntaxhighlight lang='python'>
with open('somefile.txt', 'rt') as f:
    text = f.read()
    print(text)
</syntaxhighlight>
</font>
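As a rough sketch of what the <code>with</code> form buys, the example below reads the same file twice: once through the context manager and once with an explicit <code>try</code>/<code>finally</code>. The scratch directory and the file name <code>somefile.txt</code> are assumptions made only so the example can run anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'somefile.txt')  # hypothetical file name
    with open(target, 'wt') as f:
        f.write('hello')

    # Context-manager form: the file is closed when the block exits.
    with open(target, 'rt') as f:
        text = f.read()
    closed_after_with = f.closed

    # Roughly equivalent explicit form.
    g = open(target, 'rt')
    try:
        same_text = g.read()
    finally:
        g.close()
```

Both forms read the same content; the <code>with</code> form simply cannot forget the <code>close()</code> call.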
<syntaxhighlight lang='python'>
# General form: filename and mode are placeholders for your own values
f = open(filename, mode)
c = f.read()
f.close()
</syntaxhighlight>
<syntaxhighlight lang='python'>
f = open('somefile', 'rt')
c = f.read()
f.close()
</syntaxhighlight>
<font color=darkkhaki>
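Mode: "r" read (default), "w" write (truncates an existing file), "x" exclusive creation (fails if the file exists), "a" append. Combine with "t" for text (default) or "b" for binary, for example "rt" or "wb".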
</font>
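The mode characters can be tried out with a small sketch like the one below. It creates, appends to, and reads back a scratch file; the file name <code>modes-demo.txt</code> and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'modes-demo.txt')  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(target, 'xt', encoding='utf-8') as f:
        f.write('first\n')

    # 'a' appends to the end instead of truncating.
    with open(target, 'at', encoding='utf-8') as f:
        f.write('second\n')

    # 'rt' reads decoded text; 'rb' reads raw bytes.
    with open(target, 'rt', encoding='utf-8') as f:
        text = f.read()
    with open(target, 'rb') as f:
        raw = f.read()

    # A second 'x' open raises FileExistsError.
    try:
        open(target, 'x').close()
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True
```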
==Write==
<syntaxhighlight lang='python'>
f = open('/Users/ovidiu/tmp/out.json', 'wt')
f.write("test\n")
f.close()
</syntaxhighlight>
Written data is buffered, so to make the content visible to a reader before the file is closed, call <code>flush()</code> on the file object first.
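A minimal sketch of the <code>flush()</code> case, assuming a second handle opened on the same file while the writer is still open. The file name <code>out.txt</code> is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'out.txt')  # hypothetical file name

    writer = open(target, 'wt', encoding='utf-8')
    writer.write('test\n')
    # The text may still sit in the writer's buffer at this point.
    writer.flush()  # push buffered data to the operating system

    # A separate handle can now see what was written.
    with open(target, 'rt', encoding='utf-8') as reader:
        seen_before_close = reader.read()

    writer.close()
```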
=Working Directory=
<syntaxhighlight lang='python'>
import os
print('getcwd:', os.getcwd())
</syntaxhighlight>
Also see: {{Internal|Python_Module_os#Working_Directory|<tt>os</tt>}}
=The Path of the Running Script File=
<syntaxhighlight lang='python'>
print('__file__:', __file__)
</syntaxhighlight>
=Paths=
<code>os.path.basename</code> returns the file name from the file path:
<syntaxhighlight lang='python'>
import os
print(os.path.basename(__file__))
</syntaxhighlight>
<code>os.path.dirname</code> returns the directory name from the file path.
<syntaxhighlight lang='python'>
import os
print(os.path.dirname(__file__))
</syntaxhighlight>
<code>os.path.abspath</code> returns the absolute path from a file path.
<font color=darkkhaki>
<code>os.path.splitext</code> splits the path into a <code>(root, ext)</code> pair, where <code>ext</code> is the file extension, including the leading dot.
</font>
<font color=darkkhaki>Use the pathlib module to extract directory name.</font>
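The path functions above can be compared side by side with their <code>pathlib</code> counterparts. This is only a sketch: the sample path is made up, and <code>PurePosixPath</code> is used so that the <code>pathlib</code> results do not depend on the operating system.

```python
import os.path
from pathlib import PurePosixPath

sample = '/tmp/data/report.tar.gz'  # hypothetical path

name = os.path.basename(sample)        # 'report.tar.gz'
directory = os.path.dirname(sample)    # '/tmp/data'
root, ext = os.path.splitext(sample)   # ('/tmp/data/report.tar', '.gz')
absolute = os.path.abspath('report.txt')  # anchored at the working directory

# pathlib equivalents; PurePosixPath never touches the filesystem.
p = PurePosixPath(sample)
pathlib_directory = str(p.parent)      # '/tmp/data'
pathlib_suffix = p.suffix              # '.gz'
pathlib_stem = p.stem                  # 'report.tar'
```

Note that <code>splitext</code> and <code>suffix</code> only split off the last extension, so <code>.tar.gz</code> yields <code>.gz</code>.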
==Path Operations with <tt>os.path</tt>==
To join two or more path fragments using the OS-native path separator, use <code>os.path.join()</code>:
<syntaxhighlight lang='python'>
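import os
path1 = 'a'
path2 = 'b'
path3 = 'c'
# The result is 'a/b/c' on POSIX; Windows uses a backslash as separator
assert os.path.join(path1, path2, path3) == os.sep.join(['a', 'b', 'c'])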
</syntaxhighlight>
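One behavior of <code>os.path.join()</code> that is easy to miss is that an absolute component discards everything before it. The sketch below shows this with made-up fragments and builds the expected values from <code>os.sep</code>, so it does not assume a particular platform.

```python
import os

joined = os.path.join('a', 'b', 'c')
expected = os.sep.join(['a', 'b', 'c'])  # separator depends on the platform

# An absolute component discards everything before it.
restarted = os.path.join('a', os.sep + 'b', 'c')
expected_restarted = os.sep + 'b' + os.sep + 'c'
```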
=Removing Files=
<syntaxhighlight lang='python'>
import os
import shutil
os.remove("somefile.txt")  # removes a file
os.rmdir("somedir")  # removes an empty directory
shutil.rmtree("sometree")  # deletes a directory and all its contents
</syntaxhighlight>
<code>Path</code> objects from the Python 3.4+ <code>pathlib</code> module also expose these instance methods:
<syntaxhighlight lang='python'>
from pathlib import Path
Path("somefile.txt").unlink()  # removes a file or symbolic link
Path("somedir").rmdir()  # removes an empty directory
</syntaxhighlight>
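The removal calls can be exercised safely against a scratch directory. This is a sketch under the assumption of a made-up layout (one file, one empty directory, one populated tree); none of the names come from a real project.

```python
import os
import shutil
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())  # scratch directory for the demo

# Hypothetical layout: one file, one empty directory, one populated tree.
(base / 'somefile.txt').write_text('x')
(base / 'emptydir').mkdir()
(base / 'tree' / 'nested').mkdir(parents=True)
(base / 'tree' / 'nested' / 'leaf.txt').write_text('y')

os.remove(base / 'somefile.txt')   # file
os.rmdir(base / 'emptydir')        # empty directory only
shutil.rmtree(base / 'tree')       # directory and everything under it

remaining = sorted(p.name for p in base.iterdir())
base.rmdir()                       # the scratch directory is empty again
```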
=Recursively Copy a Directory=
{{Internal|Python_Module_shutil#Recursively_Copy_a_Directory|Recursively Copy a Directory with <tt>shutil</tt>}}
=<span id='Temporary_Files'></span>Temporary Files and Directories=
{{Internal|Python Temporary Files and Directories|Temporary Files and Directories}}
=<tt>pathlib</tt>=
A <code>Path</code> represents a filesystem path and offers methods that perform system calls on it. Depending on your system, instantiating a <code>Path</code> returns either a <code>PosixPath</code> or a <code>WindowsPath</code> object. You can also instantiate a <code>PosixPath</code> or <code>WindowsPath</code> directly, but you cannot instantiate a <code>WindowsPath</code> on a POSIX system or vice versa.
A new <code>Path</code> instance can be constructed from an existing <code>Path</code> instance:
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path('.')
path2 = Path(path, './some-file.txt')
</syntaxhighlight>
Convert the <code>Path</code> to a string with <code>str()</code>:
<syntaxhighlight lang='python'>
path = Path('.')
print(str(path))
</syntaxhighlight>
====Accessing the File Name====
<syntaxhighlight lang='python'>
Path('...').name
</syntaxhighlight>
====Accessing the Parent====
<syntaxhighlight lang='python'>
Path('...').parent
</syntaxhighlight>
====Accessing the Full Path====
<syntaxhighlight lang='python'>
str(Path('...'))
</syntaxhighlight>
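The three accessors above can be seen together on one concrete path. The sketch uses <code>PurePosixPath</code> so the output is the same on every operating system; the <code>/</code> operator and the <code>parents</code> sequence are additional <code>pathlib</code> features shown here for context.

```python
from pathlib import PurePosixPath

# PurePosixPath keeps the result identical on every OS.
path = PurePosixPath('/Users/ovidiu/tmp') / 'out.json'

file_name = path.name          # 'out.json'
parent = path.parent           # PurePosixPath('/Users/ovidiu/tmp')
as_text = str(path)            # '/Users/ovidiu/tmp/out.json'
grandparent = path.parents[1]  # PurePosixPath('/Users/ovidiu')
```

<code>str()</code> returns the path exactly as it was built; it is a full path only if the <code>Path</code> was absolute to begin with.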
====<tt>resolve(strict=False)</tt>====
<code>resolve()</code> makes the path absolute, following symbolic links and collapsing ".." components:
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path('/Users/ovidiu/..')
print(path.resolve()) # will display "/Users"
</syntaxhighlight>
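A sketch of <code>resolve()</code> against a scratch directory, including the <code>strict=True</code> variant named in the heading. The directory names <code>child</code> and <code>missing</code> are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()          # canonical form of the scratch dir
    (base / 'child').mkdir()

    dotted = base / 'child' / '..'
    resolved = dotted.resolve()         # the '..' component is collapsed

    # With strict=True a missing path raises FileNotFoundError.
    try:
        (base / 'missing').resolve(strict=True)
        strict_raised = False
    except FileNotFoundError:
        strict_raised = True
```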
====<tt>mkdir(mode=0o777, parents=False, exist_ok=False)</tt>====
Create a directory, including its non-existent parents if required.
<syntaxhighlight lang='python'>
from pathlib import Path
d = Path('somedir')
d.mkdir(mode=0o700, parents=True, exist_ok=False)
</syntaxhighlight>
Setting <code>parents</code> to <code>True</code> will create intermediate missing directories if necessary. By default, <code>parents</code> is <code>False</code>.
The method raises <code>FileExistsError</code> if the directory already exists, unless <code>exist_ok</code> is set to <code>True</code>.
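The effect of <code>parents</code> and <code>exist_ok</code> can be checked with a short sketch. The nested path <code>a/b/somedir</code> is an assumption for illustration and lives in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'a' / 'b' / 'somedir'  # hypothetical nested path

    # parents=True creates 'a' and 'a/b' on the way.
    target.mkdir(mode=0o700, parents=True)
    created = target.is_dir()

    # A second call fails unless exist_ok=True.
    try:
        target.mkdir()
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    target.mkdir(exist_ok=True)  # no error this time
```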
====<tt>exists(), is_file(), is_dir()</tt>====
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path(path_to_file)
path.exists()
path.is_file()
path.is_dir()
</syntaxhighlight>
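To make the difference between the three checks concrete, the sketch below runs them against a file, a directory, and a path that does not exist. All names are hypothetical and created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    file_path = directory / 'present.txt'   # hypothetical names
    missing = directory / 'absent.txt'
    file_path.write_text('data')

    # Each tuple is (exists(), is_file(), is_dir()).
    checks = {
        'file': (file_path.exists(), file_path.is_file(), file_path.is_dir()),
        'dir': (directory.exists(), directory.is_file(), directory.is_dir()),
        'missing': (missing.exists(), missing.is_file(), missing.is_dir()),
    }
```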
====<tt>rmdir()</tt>====
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path(path_to_dir)
path.rmdir()
</syntaxhighlight>
The directory must be empty. <code>shutil</code> has a function that [[Python_Module_shutil#Recursively_Delete_a_Directory|deletes the directory recursively]].
====Remove a file or a symbolic link====
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path(path_to_file)
path.unlink()
</syntaxhighlight>
By default, the call raises a <code>FileNotFoundError</code> if the file does not exist. To suppress this behavior, use <code>unlink(missing_ok=True)</code> (available since Python 3.8).
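A minimal sketch of both behaviors, assuming Python 3.8 or later and a hypothetical file name in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'somefile.txt'  # hypothetical file name
    target.touch()
    target.unlink()                      # file exists, so this succeeds

    try:
        target.unlink()                  # already gone
        raised = False
    except FileNotFoundError:
        raised = True

    target.unlink(missing_ok=True)       # Python 3.8+: silently does nothing
    still_exists = target.exists()
```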
====<tt>iterdir()</tt>====
Iterate over the files and directories in the given directory.  Does not yield any result for the special paths '.' and '..'.
<syntaxhighlight lang='python'>
from pathlib import Path
path = Path(path_to_dir)
for f in path.iterdir():
    if f.is_file():
        ...
    elif f.is_dir():
        ...
</syntaxhighlight>
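A runnable variant of the loop above, as a sketch with made-up entries. Since <code>iterdir()</code> yields entries in arbitrary order, the example sorts the names before comparing them.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'b.txt').touch()          # hypothetical entries
    (root / 'a.txt').touch()
    (root / 'sub').mkdir()

    # iterdir() yields entries in arbitrary order, so sort explicitly.
    files = sorted(p.name for p in root.iterdir() if p.is_file())
    dirs = sorted(p.name for p in root.iterdir() if p.is_dir())
```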
====<tt>touch()</tt>====
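Creates the file at this path if it does not exist. If the file already exists, the call succeeds and updates its modification time, unless <code>exist_ok</code> is set to <code>False</code>, in which case <code>FileExistsError</code> is raised.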
===Other <tt>pathlib</tt> Methods===
* <code>cwd()</code>
* <code>home()</code>
* <code>samefile(other_path)</code>
* <code>glob(pattern)</code>
* <code>rglob(pattern)</code>
* <code>absolute()</code>
* <code>stat()</code>
* <code>group()</code>
* <code>open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)</code>
* <code>read_bytes()</code>
* <code>read_text(encoding=None, errors=None)</code>
* <code>write_bytes(data)</code>
* <code>write_text(data, encoding=None, errors=None)</code>
* <code>touch(mode=0o666, exist_ok=True)</code>
* <code>chmod(mode)</code>
* <code>lchmod(mode)</code>
* <code>unlink(missing_ok=False)</code>
* <code>lstat()</code>
* <code>link_to(target)</code> (deprecated in Python 3.10 and removed in 3.12; use <code>hardlink_to(target)</code> instead)
* <code>rename(target)</code>
* <code>replace(target)</code>
* <code>symlink_to(target, target_is_directory=False)</code>
* <code>is_mount()</code>
* <code>is_symlink()</code>
* <code>is_block_device()</code>
* <code>is_char_device()</code>
* <code>is_fifo()</code>
* <code>is_socket()</code>
* <code>expanduser()</code>
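A few of the methods listed above, combined in one sketch: <code>write_text()</code>, <code>read_text()</code>, <code>stat()</code>, <code>glob()</code>, <code>rglob()</code> and <code>rename()</code>. The file and directory names are hypothetical and everything happens in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'nested').mkdir()

    note = root / 'note.txt'               # hypothetical file names
    note.write_text('hello', encoding='utf-8')
    content = note.read_text(encoding='utf-8')
    size = note.stat().st_size             # size in bytes

    (root / 'nested' / 'deep.txt').touch()

    # glob() matches in this directory only; rglob() also descends.
    top_level = sorted(p.name for p in root.glob('*.txt'))
    recursive = sorted(p.name for p in root.rglob('*.txt'))

    note.rename(root / 'renamed.txt')
    old_exists = note.exists()
    new_exists = (root / 'renamed.txt').exists()
```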
=<tt>os.path</tt>=
====<tt>exists(path_to_file)</tt>====
<syntaxhighlight lang='python'>
import os.path
file_exists = os.path.exists(path_to_file)
</syntaxhighlight>
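Returns <code>True</code> if the path refers to an existing file or directory and <code>False</code> otherwise, including for broken symbolic links.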
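Since <code>os.path.exists()</code> does not distinguish files from directories, it is often paired with <code>os.path.isfile()</code> or <code>os.path.isdir()</code>. A small sketch, with hypothetical names in a temporary directory:

```python
import os.path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, 'present.txt')  # hypothetical names
    absent = os.path.join(tmp, 'absent.txt')
    with open(present, 'w') as f:
        f.write('data')

    present_exists = os.path.exists(present)
    absent_exists = os.path.exists(absent)
    dir_exists = os.path.exists(tmp)     # exists() is True for directories too
    dir_is_file = os.path.isfile(tmp)    # isfile() tells them apart
```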

Latest revision as of 18:50, 21 June 2023
