onlyforbopi
4/28/2017 - 8:58 AM

Python.Debug.ErrorCheck


PYTHON DEBUGGING

1. charmaperror              - How to solve the 'charmap' UnicodeEncodeError (the console code page can't encode a Unicode character)
Traceback (most recent call last):
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 444, in <module>
    dar_p_fisica()
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 390, in dar_p_fisica
    print(datos.text) #.encode().decode('ascii', 'ignore')
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 173: character maps to <undefined>


# In command prompt, before running the script:

chcp 65001
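
The traceback above comes from print() trying to encode U+2010 with the console codec (cp1252). As a rough Python-side fallback for cases where the code page cannot be changed, the text can be made safe before printing. This is only a sketch: the helper name to_console_safe is hypothetical and not part of the original script.

```python
import sys


def to_console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes (hypothetical helper, not from scrap.py)."""
    # Fall back to UTF-8 if stdout has no usable encoding attribute
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


sample = "caf\u00e9 \u2010 ok"

# cp1252 has no mapping for U+2010, so a strict encode fails the same way
# print() did in the traceback
try:
    sample.encode("cp1252")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)

# Uses the encoding of the current stdout, so this print cannot raise
print(to_console_safe(sample))
```

Escaping keeps the script running at the cost of showing \u2010 instead of the real hyphen; switching the console to UTF-8 as shown above keeps the character itself.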

# Other codepages

CHCP.com

Change the active console Code Page. The default code page is determined by the Windows Locale.

Syntax
      CHCP code_page

Key
   code_page  A code page number (e.g. 437)  
This command is rarely required as most GUI programs and PowerShell now support Unicode. When working with characters outside the ASCII range of 0-127, the choice of code page will determine the set of characters displayed.

Programs that you start after you assign a new code page will use the new code page; however, programs (except Cmd.exe) that you started before assigning the new code page will continue to use the original code page.
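
To see which code page a Python process actually picked up, something like the following sketch can be run from the console in question. It assumes the Win32 function GetConsoleOutputCP reached through ctypes; the helper name console_code_page is hypothetical, and on non-Windows systems it simply returns None.

```python
import sys


def console_code_page():
    """Return the active console output code page on Windows, else None
    (hypothetical helper for illustration)."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily; windll only exists on Windows
    return ctypes.windll.kernel32.GetConsoleOutputCP()


print("console code page:", console_code_page())
print("sys.stdout.encoding:", sys.stdout.encoding)
```

Note that since Python 3.6 the interpreter writes to an interactive Windows console as UTF-8 regardless of the code page, so the two values can differ; the code page mostly matters for older versions (such as the Python 3.4 in the traceback) and for redirected output.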

Code page	Country/ Region/ Language
437	United States
850	Multilingual (Latin I)
852	Slavic (Latin II)
855	Cyrillic (Russian)
857	Turkish
860	Portuguese
861	Icelandic
863	Canadian-French
865	Nordic
866	Russian
869	Modern Greek
1252	West European Latin
65000	UTF-7 *
65001	UTF-8 *
* The 65000/1 code pages are encoded as UTF-7/8 to allow working with Unicode data in 7-bit and 8-bit environments. However, even if you use CHCP to run the Windows Console in a Unicode code page, many applications will assume that the default still applies, e.g. Java requires the -Dfile.encoding option: java -Dfile.encoding=UTF-8
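
The code pages in the table above correspond to codecs that ship with Python, so it is possible to check in advance which of them can represent a given string. The sketch below assumes the mapping shown (the dictionary CODE_PAGE_CODECS and the function pages_that_can_encode are hypothetical names introduced here).

```python
# Hypothetical lookup table: Windows code page number -> Python codec name
CODE_PAGE_CODECS = {
    437: "cp437", 850: "cp850", 852: "cp852", 855: "cp855", 857: "cp857",
    860: "cp860", 861: "cp861", 863: "cp863", 865: "cp865", 866: "cp866",
    869: "cp869", 1252: "cp1252", 65000: "utf-7", 65001: "utf-8",
}


def pages_that_can_encode(text):
    """Return the code page numbers from the table whose codec can encode text."""
    usable = []
    for page, codec in sorted(CODE_PAGE_CODECS.items()):
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue  # this code page has no mapping for some character
        usable.append(page)
    return usable


# U+2010 is the character from the traceback
print(pages_that_can_encode("\u2010"))
```

Code page 1252 is not in the result for U+2010, which matches the error in the traceback, while 65001 (UTF-8) is.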

Unicode characters will only display if the current console font contains the characters, so use a TrueType font like Lucida Console instead of the CMD default Raster Font.

The CMD Shell (which runs inside the Windows Console)
CMD.exe only supports two character encodings for its output: ANSI and Unicode (CMD /A and CMD /U)

If you need full Unicode support, use PowerShell. Support for Unicode in the CMD shell is still VERY limited: piping, redirection and most commands are ANSI only. The only commands that work are DIR, FOR /F and TYPE, which allow reading and writing (UTF-16LE / BOM) files and filenames but not much else.

Defaults
The default code page in the USA is 437; the default in most of Europe is 850. The number of supported code pages was greatly increased in Windows 7. For a full list of code pages supported on your machine, run NLSINFO (Resource Kit Tools).

Files saved in Windows Notepad will be in ANSI format by default, but can also be saved as Unicode UTF-16LE or UTF-8; Unicode files will include a BOM.
A BOM will make a batch file not executable on Windows, so batch files must be saved as ANSI, not Unicode.
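
When a batch file is generated or checked from Python, the BOM can be detected and stripped before the file is written. A minimal sketch, assuming UTF-8 input; the helper names has_utf8_bom and strip_utf8_bom are hypothetical.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# The "utf-8-sig" codec prepends a BOM on encode; plain "utf-8" does not
with_bom = "@echo off\r\n".encode("utf-8-sig")
without_bom = "@echo off\r\n".encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(strip_utf8_bom(with_bom) == without_bom)
```

Writing the stripped bytes with open(path, "wb") avoids any further newline or encoding translation.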

Examples:

View the current code page:
chcp

Change the code page to Unicode (UTF-8, code page 65001):
chcp 65001
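
A Python-side counterpart to chcp 65001 is to ask the interpreter to write UTF-8 itself. The sketch below relies on sys.stdout.reconfigure, which exists only in Python 3.7 and later (so not in the Python 3.4 shown in the traceback); the wrapper name force_utf8_stdout is hypothetical.

```python
import sys


def force_utf8_stdout():
    """Try to switch sys.stdout to UTF-8; return True on success,
    False if the stream does not support reconfigure()."""
    reconfigure = getattr(sys.stdout, "reconfigure", None)  # Python 3.7+
    if reconfigure is None:
        return False
    # errors="replace" substitutes "?" for anything that still cannot be written
    reconfigure(encoding="utf-8", errors="replace")
    return True


if force_utf8_stdout():
    print("stdout now uses:", sys.stdout.encoding)
else:
    print("stdout could not be reconfigured")
```

On older interpreters, setting the environment variable PYTHONIOENCODING=utf-8 before starting Python has a similar effect on the standard streams.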
